In this paper we explore the problem of counting solutions to conjunctive 
queries. We consider a parameter called the quantified star size of a formula $\phi$ 
which measures how the free variables are spread in $\phi$. We show that for 
conjunctive queries that admit nice decomposition properties (such as being 
of bounded treewidth or generalized hypertree width) bounded quantified 
star size exactly characterizes the classes of queries for which counting the 
number of solutions is tractable. This also allows us to fully characterize the 
conjunctive queries for which counting the solutions is tractable in the case of 
bounded arity. To illustrate the applicability of our results, we also show that 
computing the quantified star size of a formula is possible in time $n^{O(k)}$ for 
queries of generalized hypertree width $k$. Furthermore, quantified star size is 
even fixed parameter tractable parameterized by some other width measures, 
while it is W[1]-hard for generalized hypertree width and thus unlikely to be 
fixed parameter tractable. We finally show how to compute an approximation 
of quantified star size in polynomial time where the approximation ratio 
depends on the width of the input. 

1 Introduction 

Conjunctive queries (CQs) are a fundamental class of logical queries that consist of 
evaluating an existential conjunctive first-order formula over a finite structure. They 
admit a number of equivalent formulations, for example as select-project-join queries in 
database theory or as homomorphism problems in constraint satisfaction and thus have 
been extensively studied in various contexts. Deciding if a Boolean CQ is true or not on 
a structure is well known to be NP-complete, so the main interest of study has been 
to identify tractable subclasses, so-called "islands of tractability", where the decision 
question is tractable, i.e. can be solved in polynomial time. 

One main direction in finding tractable classes of CQs has been imposing structural 
restrictions on the formula of the query - more exactly on the hypergraph associated to 
it - while the database is assumed to be arbitrary. In a seminal paper Yannakakis [25] 
proved that if the formula is acyclic, then the Boolean CQ question becomes tractable. 
The main idea behind structural restrictions is to extend this result by generalizing it 
to "nearly acyclic" queries. This has led to many different decompositions for graphs 
and hypergraphs and associated width measures (see e.g. [I31IH1E3]). The common 
approach for these decompositions is to group together vertices or edges (of the graphs or 
hypergraphs) into clusters of some fixed constant size and to arrange these clusters into 
a tree. The resulting width measures are often sought to have two desirable properties: 

• For every k the class of queries of width k should be tractable, i.e. Boolean CQ 
should be solvable in polynomial time. 

• Given an instance it should be possible to decide if there is a decomposition of 
width k and construct one if it exists. 

While decomposition techniques without the first property do not make any sense in 
the context of CQs, the second property is sometimes relaxed. For some decomposition 
techniques one does not actually need the decomposition to solve the Boolean query 
problem [^1, a promise of the existence is enough. For other decompositions one only 
knows approximation algorithms that construct decompositions of width that is near the 
optimal width, which is enough to guarantee tractability of Boolean CQ [221 E]- 

More recently there has also been interest in enumerating all solutions to CQs and in 
the corresponding counting question. For enumeration of the query answers it turns out 
that the picture is less clear than for decision [21 HI |T6] . Also the situation for counting is 
more subtle: For quantifier free queries - which correspond to queries without projections 
in the database perspective - most commonly considered structural restrictions yield 
tractable counting problems (see, e.g. [24]). While this is nice, it is not fully satisfying, 
because quantifiers/projections are very natural and essential in database queries. While 
introducing projections does not make any difference for the complexity of Boolean CQ, 
the situation for the associated counting problem, denoted #CQ, is dramatically different. 
In [24] it is shown that even one single existentially quantified variable is enough to make 
counting answers to CQs #P-hard even when the structure of the query is a tree (which 
implies width 1 for all commonly considered decomposition techniques). This underlines 
the gain of expressive power obtained by existential quantification in the context of 
counting. It also follows that the decomposition techniques used for Boolean CQ are not 
enough to guarantee tractability for counting. 






In a previous paper [10] the authors of this paper have proposed a way out of this 
dilemma for counting by introducing a parameter called quantified star size for acyclic 
conjunctive queries (ACQs). This parameter measures how the free variables are spread 
in the formula. We represented a query formula $\phi(\mathbf{x})$ with a list $\mathbf{x}$ of free variables by 
extending the hypergraph $\mathcal{H} = (V, E)$ associated to $\phi(\mathbf{x})$ with a set $S \subseteq V$ of the 
vertices corresponding to the free variables. Then the 
quantified star size is the size of a maximum independent set consisting of vertices from 
the set $S$ in some specified subhypergraphs of $\mathcal{H}$. It turns out that this measure precisely 
characterizes the tractable subclasses of ACQs. The main result is that (under the widely 
believed assumption FPT $\neq$ #W[1] from parameterized complexity) solutions to a class 
of ACQs can be counted in polynomial time if and only if the queries in the class are of 
bounded quantified star size. 

Overview of the results 

Counting solutions to queries In this paper we extend the results of [10] from acyclic 
queries to commonly considered decomposition techniques. To do so we generalize the 
notion of quantified star size from acyclic queries to general conjunctive queries. We 
show that every class of CQs that allows efficient counting must be of bounded quantified 
star size - again under the same assumption from parameterized complexity. We then go 
on showing that for all decomposition techniques for CQs commonly considered in the 
literature combining them with bounded quantified star size leads to tractable counting 
problems. The key feature that makes this result work is the organization of atoms 
into a tree of clusters that is prominent in all decomposition methods for CQs known 
so far. Combining the results above we get an exact characterization of the classes of 
tractable CQ counting problems for commonly considered decomposition techniques. Let 
us illustrate these results for the example of generalized hypertree decomposition [13], 
which is one of the most general decomposition methods and one of the most studied 
too [131 113 [23]. We have that, under the assumption that FPT $\neq$ #W[1], for any 
(recursively enumerable) class C of hypergraphs of bounded generalized hypertreewidth 
the following statements are equivalent: 

• #CQ for instances in C can be solved in polynomial time 

• C is of bounded quantified star size. 

In our considerations, the arity of atoms of queries is not a priori bounded. In this 
setting, there is no known ultimate measure resulting from a decomposition method 
that fully characterizes tractability even for Boolean CQ. This explains why our char- 
acterizations are stated for each decomposition method. For bounded arity however, 
the situation is different. It is well known that being of bounded treewidth completely 
characterizes tractability for decision [19, 17] and counting [9] for CSP (corresponding to 
quantifier free conjunctive queries in this setting). Combining [19, 17] and our results 
from above we derive a complete characterization of tractability for #CQ in terms of 
treewidth and quantified star size for the bounded arity case. 
Note that our results are for counting with set semantics, i.e. we count each solution 
only once. Counting for bag semantics, in which multiple occurrences of identical tuples 
are counted, has already been essentially solved in [24]. 

Discovering quantified star size To exploit tractability results of the above kind it is 
helpful if the membership in a tractable class can be decided efficiently, i.e. in our case if 
computing the quantified star size is also tractable. In the second part of the paper, we 
turn to these "discovery problems" of determining the quantified star size of queries. 

In [10] it is shown that quantified star size of acyclic CQs can be determined in 
polynomial time. Since star size is equivalent to independent sets, we cannot expect this 
to be true on more general queries anymore. Fortunately, it turns out that for queries of 
generalized hypertree width $k$, there is an $n^{O(k)}$ algorithm that computes the quantified star 
size. We show that this is in a sense optimal, because under the assumption FPT $\neq$ W[1] 
there is no efficient (fixed parameter tractable in $k$) algorithm computing the quantified 
star size for queries parameterized by generalized hypertree width. 

Still some natural decomposition methods admit fixed parameter discovery algorithms. 
We prove that this is the case for the class of CQs having bounded hingetree width (see [8]). 
This result is interesting in its own right from a hypergraph algorithms perspective. Because 
of the connection between star size and maximum independent set, it provides a new 
class of hypergraphs for which computing the maximum independent set is FPT. Note 
that the preceding hardness result shows that fixed parameter tractability of this 
problem is unlikely for other hypergraph decomposition techniques. 

We then turn our attention to star size approximation. We show that there is a 
polynomial time approximation algorithm with ratio k that given a decomposition of 
width k runs in time independent of k. 

Summing these results up, quantified star size does not only imply tractable counting 
if combined with well known decomposition techniques, but in case the decomposition is 
given or can be efficiently computed (hypertreewidth, hingetree width) or approximated 
(generalized hypertreewidth), then computing quantified star size is itself tractable. 

Finally, we investigate the problem of counting solutions and computing quantified 
star size for queries of bounded fractional hypertree width [18, 22]. This decomposition 
method is of a somewhat different nature than the ones studied before so we treat it 
individually. We again prove that counting is tractable in this setting and that the 
discovery problem can be decided in 0{n''^^^^) i.e. with a slightly bigger dependency in k 
than before. 

2 Preliminaries 

Conjunctive queries We assume the reader to be familiar with the basics of (first order) 
logic (see [21]). We assume all formulas to be in prenex form. If $\phi$ is a first order formula, 
$\mathrm{var}(\phi)$ denotes the set of its variables, $\mathrm{free}(\phi) \subseteq \mathrm{var}(\phi)$ the set of its free variables and 
$\mathrm{atom}(\phi)$ the set of its atomic formulas. Let $\mathbf{x} = x_1, \ldots, x_k$; we denote by $\phi(\mathbf{x})$ the formula 
with free variables $\mathbf{x}$. If $\phi$ is such that $\mathrm{free}(\phi) = \mathrm{var}(\phi)$ then $\phi$ is said to be quantifier-free. 
The Boolean query problem $\Phi = (\mathcal{A}, \phi)$ associated to a formula $\phi(\mathbf{x})$ and a structure $\mathcal{A}$ 
asks whether the set 

$\phi(\mathcal{A}) = \{\mathbf{a} : (\mathcal{A}, \mathbf{a}) \models \phi(\mathbf{x})\}$ 

called the query result is empty or not. The (general) query problem consists of computing 
the set $\phi(\mathcal{A})$, while the corresponding counting problem is computing the size of $\phi(\mathcal{A})$, 
denoted by $|\phi(\mathcal{A})|$. We call two instances $\Phi = (\mathcal{A}, \phi)$, $\Phi' = (\mathcal{A}', \phi')$ solution equivalent if 
$\mathrm{free}(\phi) = \mathrm{free}(\phi')$ and $\phi(\mathcal{A}) = \phi'(\mathcal{A}')$. When $\phi$ is a $\{\exists, \wedge\}$-first order formula the Boolean 
query problem is known as the Conjunctive Query Problem, CQ for short. It is well 
known that the Boolean CQ problem is NP-complete. We denote by #CQ the 
associated counting problem: given a query instance $\Phi = (\mathcal{A}, \phi)$, return $|\phi(\mathcal{A})|$. 
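
To make the set semantics of #CQ concrete, here is a minimal brute-force sketch in Python. The function name `count_cq` and the encoding (atoms as pairs of a relation name and a variable tuple, relations as sets of tuples) are assumptions made for illustration only; the sketch enumerates all assignments and is exponential, so it is not one of the algorithms of this paper.

```python
from itertools import product

def count_cq(atoms, free_vars, relations, domain):
    """Count assignments to free_vars that can be extended to the
    quantified variables so that every atom is satisfied."""
    all_vars = sorted({v for _, vs in atoms for v in vs} | set(free_vars))
    quantified = [v for v in all_vars if v not in free_vars]
    count = 0
    for free_vals in product(domain, repeat=len(free_vars)):
        a = dict(zip(free_vars, free_vals))
        for q_vals in product(domain, repeat=len(quantified)):
            full = {**a, **dict(zip(quantified, q_vals))}
            if all(tuple(full[v] for v in vs) in relations[r] for r, vs in atoms):
                count += 1  # each free assignment is counted once (set semantics)
                break
    return count

# Hypothetical instance: phi(x) = exists z E(x, z)
E = {(1, 2), (1, 3), (2, 3)}
with_projection = count_cq([("E", ("x", "z"))], ["x"], {"E": E}, [1, 2, 3])
quantifier_free = count_cq([("E", ("x", "z"))], ["x", "z"], {"E": E}, [1, 2, 3])
```

In this toy instance the value x = 1 has two witnesses for z but contributes only once to the count when z is quantified.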

Any $\mathbf{a} \in \phi(\mathcal{A})$ will be alternatively seen as an assignment $a : \mathrm{free}(\phi) \to D$ or as a tuple 
of dimension $|\mathrm{free}(\phi)|$. Two assignments $a$ and $a'$ are compatible (symbol: $a \sim a'$) if they 
agree on their common variables. 

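Definition 2.1 Let $\phi(\mathbf{x}, \mathbf{y})$, $\psi(\mathbf{y}, \mathbf{z})$ be two conjunctive queries with $\mathbf{x} \cap \mathbf{z} = \emptyset$ and let 
$\mathcal{A}, \mathcal{A}'$ be two finite structures. The natural join of $\phi$ and $\psi$ is 
$\phi(\mathcal{A}) \bowtie \psi(\mathcal{A}') := \{(\mathbf{a}, \mathbf{b}, \mathbf{c}) : (\mathbf{a}, \mathbf{b}) \in \phi(\mathcal{A}) \text{ and } (\mathbf{b}, \mathbf{c}) \in \psi(\mathcal{A}')\}$. 

When $\mathcal{A} = \mathcal{A}'$, $\phi(\mathcal{A}) \bowtie \psi(\mathcal{A})$ is simply $[\phi \wedge \psi](\mathcal{A})$. 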
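
The natural join can be sketched in a few lines of Python. The helper `natural_join` and its representation of a query result as a variable list plus a set of tuples are hypothetical choices for illustration; the nested loop is the naive quadratic method.

```python
def natural_join(rel_a, vars_a, rel_b, vars_b):
    """Join two relations on their shared variables (compatible tuples only)."""
    shared = [v for v in vars_a if v in vars_b]
    out_vars = list(vars_a) + [v for v in vars_b if v not in vars_a]
    result = set()
    for ta in rel_a:
        a = dict(zip(vars_a, ta))
        for tb in rel_b:
            b = dict(zip(vars_b, tb))
            # the two assignments are compatible if they agree on shared variables
            if all(a[v] == b[v] for v in shared):
                merged = {**a, **b}
                result.add(tuple(merged[v] for v in out_vars))
    return out_vars, result

# Hypothetical relations over variables (x, y) and (y, z)
out_vars, joined = natural_join({(1, 2), (3, 4)}, ["x", "y"],
                                {(2, 5), (2, 6), (7, 8)}, ["y", "z"])
```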

Query size and Model of computation The underlying model of computation for 
our algorithms will be the RAM model with unit costs. We assume the relations of a 
conjunctive query to be encoded by listing their tuples. For a relation $\mathcal{R}$ let $\mathrm{arity}(\mathcal{R})$ 
denote the arity of $\mathcal{R}$ and $|\mathcal{R}|$ the number of tuples in $\mathcal{R}$. Then the size of an encoding 
of $\mathcal{R}$ is $\|\mathcal{R}\| := O(\mathrm{arity}(\mathcal{R}) \cdot |\mathcal{R}|)$. For a vocabulary $\tau$ let $|\tau|$ be the number of predicate 
symbols. Finally, let $|D|$ be the size of a domain $D$. Then encoding a structure $\mathcal{A}$ over 
the vocabulary $\tau$ with domain $D$ takes space $\|\mathcal{A}\| := |\tau| + |D| + \sum_{\mathcal{R} \in \tau} \|\mathcal{R}\|$. 

Furthermore, it takes space $\|\phi\| := O(\sum_{P \in \mathrm{atom}(\phi)} \mathrm{arity}(P))$ to encode a formula $\phi$. The 
size of an encoding of a CQ instance $\Phi = (\mathcal{A}, \phi)$ is then $\|\Phi\| := \|\phi\| + \|\mathcal{A}\|$. 

For a detailed discussion and justification of these conventions see Section 2.3] 

Parameterized complexity This section is a very short introduction to some notions 
from parameterized complexity used in the remainder of this paper (for more details 
see [11]). 

A parameterized decision problem over an alphabet $\Sigma$ is a language $L \subseteq \Sigma^*$ together 
with a computable parameterization $\kappa : \Sigma^* \to \mathbb{N}$. The problem $(L, \kappa)$ is said to be fixed 
parameter tractable, or $(L, \kappa) \in$ FPT, if there is a computable function $f : \mathbb{N} \to \mathbb{N}$ such 
that there is an algorithm that decides for $x \in \Sigma^*$ in time $f(\kappa(x)) \cdot |x|^{O(1)}$ if $x$ is in $L$. 

Let $(L, \kappa)$ and $(L', \kappa')$ be two parameterized decision problems over the alphabets 
$\Sigma$ resp. $\Pi$. A parameterized many-one reduction from $(L, \kappa)$ to $(L', \kappa')$ is a function 
$r : \Sigma^* \to \Pi^*$ such that for all $x \in \Sigma^*$: 

• $x \in L \Leftrightarrow r(x) \in L'$, 

• $r(x)$ can be computed in time $f(\kappa(x)) \cdot |x|^c$ for a computable function $f$ and a 
constant $c$, and 

• $\kappa'(r(x)) \le g(\kappa(x))$ for a computable function $g$. 

It is easy to see that FPT is closed under parameterized many-one reductions. 

Let p-Clique be the problem of deciding on an input $(G, k)$, where $G$ is a graph and $k$ an 
integer, if $G$ has a $k$-clique. Here the parameterization $\kappa$ is simply defined by $\kappa(G, k) := k$. 
The class W[1] consists of all parameterized problems that are parameterized many-one 
reducible to p-Clique. A problem $(L, \kappa)$ is called W[1]-hard if there is a parameterized 
many-one reduction from p-Clique to $(L, \kappa)$. 

It is widely believed that FPT $\neq$ W[1] and thus in particular p-Clique and all 
W[1]-hard problems are not fixed parameter tractable. 

Parameterized counting complexity theory is developed similarly to decision complexity. 
A parameterized counting problem is a function $F : \Sigma^* \times \mathbb{N} \to \mathbb{N}$, for an alphabet $\Sigma$. Let 
$(x, k) \in \Sigma^* \times \mathbb{N}$, then we call $x$ the input of $F$ and $k$ the parameter. A parameterized 
counting problem $F$ is fixed parameter tractable, or $F \in$ FPT, if there is an algorithm 
computing $F(x, k)$ in time $f(k) \cdot |x|^c$ for a computable function $f : \mathbb{N} \to \mathbb{N}$ and a constant 
$c \in \mathbb{N}$. 

Let $F : \Sigma^* \times \mathbb{N} \to \mathbb{N}$ and $G : \Pi^* \times \mathbb{N} \to \mathbb{N}$ be two parameterized counting problems. A 
parameterized parsimonious reduction from $F$ to $G$ is an algorithm that computes for 
every instance $(x, k)$ of $F$ an instance $(y, l)$ of $G$ in time $f(k) \cdot |x|^c$ such that $l \le g(k)$ 
and $F(x, k) = G(y, l)$ for computable functions $f, g : \mathbb{N} \to \mathbb{N}$ and a constant $c \in \mathbb{N}$. A 
parameterized T-reduction from $F$ to $G$ is an algorithm with an oracle for $G$ that solves 
any instance $(x, k)$ of $F$ in time $f(k) \cdot |x|^c$ in such a way that for all oracle queries the 
instances $(y, l)$ satisfy $l \le g(k)$ for computable functions $f, g$ and a constant $c \in \mathbb{N}$. 

Let p-#Clique be the problem of counting $k$-cliques in a graph where $k$ is the parameter 
and the graph is the input. A parameterized problem $F$ is in #W[1] if there is a 
parameterized parsimonious reduction from $F$ to p-#Clique. $F$ is #W[1]-hard if there 
is a parameterized T-reduction from p-#Clique to $F$. As usual, $F$ is #W[1]-complete if 
it is in #W[1] and hard for it, too. 

Again, it is widely believed that there are problems in #W[1] (in particular the 
complete problems) that are not fixed parameter tractable. Thus, from showing that a 
problem $F$ is #W[1]-hard it follows that $F$ can be assumed to be not fixed parameter 
tractable. 

Hypergraph decompositions In this section we present some well known hypergraph 
decompositions methods. For more details and more decomposition techniques see e.g. 

A (finite) hypergraph $\mathcal{H}$ is a pair $(V, E)$ where $V$ is a finite set and $E \subseteq \mathcal{P}(V)$. We 
associate a hypergraph $\mathcal{H} = (V, E)$ to a formula $\phi$ (the canonical structure describing $\phi$) 
by setting $V := \mathrm{var}(\phi)$ and $E := \{\mathrm{var}(a) \mid a \in \mathrm{atom}(\phi)\}$. 
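
As a small illustration of this construction, the following Python sketch builds the hypergraph of a formula from its atoms. The function name `hypergraph_of` and the atom encoding (relation name, tuple of variable names) are assumptions for illustration.

```python
def hypergraph_of(atoms):
    """Return (V, E) with V = var(phi) and E = {var(a) | a in atom(phi)}."""
    vertices = set()
    edges = set()
    for _, variables in atoms:
        vertices.update(variables)
        edges.add(frozenset(variables))  # atoms over the same variables give one edge
    return vertices, edges

# Hypothetical formula P1(v1, u1) AND P2(v2, u1, u2) AND P3(u1, v1)
V, E = hypergraph_of([("P1", ("v1", "u1")),
                      ("P2", ("v2", "u1", "u2")),
                      ("P3", ("u1", "v1"))])
```

Since edges are sets of variables, the atoms P1 and P3 above induce the same hyperedge.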
Figure 1: The hypergraph associated to the formula $\phi$ of Example 2.2. 

Example 2.2 Consider the formula 

Pl{vi,Ui) A P2{V2,UI,U2) /\ P3{V2,V4,U2,U3) 
AP^ivs, Vi, V5,U3, n4, Us) A P5{Vi, ■Us, Vq, Vs) 
AP6{V7,VS,U5,U6) A P2{ve,Vg,U7) A P2 (^^8 , ^^9 , ^^s) 

The associated hypergraph is illustrated in Figure 1. 

An independent set $I$ in $\mathcal{H}$ is a set of vertices $I \subseteq V$ such that no two of them lie in 
one edge together. An edge cover of $\mathcal{H}$ is an edge set $E' \subseteq E$ such that $\bigcup_{e \in E'} e = V$. 
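
Both notions are easy to check directly. The following Python sketch, with the hypothetical helper names `is_independent` and `is_edge_cover`, only tests the two definitions and does not search for optimal sets.

```python
from itertools import combinations

def is_independent(vertex_set, edges):
    """True if no two vertices of vertex_set lie together in one edge."""
    return not any(u in e and v in e
                   for u, v in combinations(sorted(vertex_set), 2)
                   for e in edges)

def is_edge_cover(cover, vertices):
    """True if the union of the edges in cover is the whole vertex set."""
    covered = set()
    for e in cover:
        covered |= set(e)
    return covered == set(vertices)

# Hypothetical hypergraph: a path a - b - c given by two edges
edges = [{"a", "b"}, {"b", "c"}]
```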

Definition 2.3 A generalized hypertree decomposition of a hypergraph $\mathcal{H} = (V, E)$ is a 
triple $(\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ where $\mathcal{T} = (T, F)$ is a rooted tree and $\lambda_t \subseteq E$ and $\chi_t \subseteq V$ for 
every $t \in T$ satisfying the following properties: 

1. For every $v \in V$ the set $\{t \in T \mid v \in \chi_t\}$ induces a subtree of $\mathcal{T}$. 

2. For every $e \in E$ there is a $t \in T$ such that $e \subseteq \chi_t$. 

3. For every $t \in T$ we have $\chi_t \subseteq \bigcup_{e \in \lambda_t} e$. 

The first property is called the connectedness condition. The sets $\chi_t$ are called blocks or 
bags of the decomposition, while the sets $\lambda_t$ are called the guards of the decomposition. 
A pair $(\lambda_t, \chi_t)$ is called a guarded block. 

The width of a decomposition $(\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ is defined as $\max_{t \in T}(|\lambda_t|)$. The 
generalized hypertree width of $\mathcal{H}$ is the minimum width over all generalized hypertree 
decompositions of $\mathcal{H}$. 
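
The three conditions can be verified mechanically. The Python sketch below is a hedged illustration: the names `is_ghd` and `ghd_width`, and the encoding of the tree as an adjacency dictionary, are assumptions, and the code trusts the caller that the adjacency structure really is a tree.

```python
def _connected(nodes, adj):
    """Check that the given tree nodes induce a connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        t = stack.pop()
        for s in adj.get(t, ()):
            if s in nodes and s not in seen:
                seen.add(s)
                stack.append(s)
    return seen == nodes

def is_ghd(vertices, edges, adj, guards, blocks):
    """Verify conditions 1-3 of a generalized hypertree decomposition."""
    edges = {frozenset(e) for e in edges}
    for t in blocks:
        if not {frozenset(e) for e in guards[t]} <= edges:
            return False  # guards must be edges of the hypergraph
        covered = set()
        for e in guards[t]:
            covered |= set(e)
        if not set(blocks[t]) <= covered:
            return False  # condition 3: block covered by its guard
    for v in vertices:
        holders = {t for t in blocks if v in blocks[t]}
        if holders and not _connected(holders, adj):
            return False  # condition 1: connectedness
    # condition 2: every edge is contained in some block
    return all(any(e <= set(blocks[t]) for t in blocks) for e in edges)

def ghd_width(guards):
    return max(len(g) for g in guards.values())

# Hypothetical example: the triangle with a single guarded block of width 2
tri = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
ok = is_ghd({"a", "b", "c"}, tri, {0: set()},
            {0: [{"a", "b"}, {"b", "c"}]}, {0: {"a", "b", "c"}})
# Three width-1 blocks on a path violate connectedness for vertex a
bad = is_ghd({"a", "b", "c"}, tri, {0: {1}, 1: {0, 2}, 2: {1}},
             {0: [{"a", "b"}], 1: [{"b", "c"}], 2: [{"a", "c"}]},
             {0: {"a", "b"}, 1: {"b", "c"}, 2: {"a", "c"}})
```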

We sometimes identify a guarded block $(\lambda_t, \chi_t)$ with the vertex $t$. 

Figure 2: A generalized hypertree decomposition of width 3 for the hypergraph from 
Figure 1. The boxes are the guarded blocks. In the upper parts the guards are 
given while the lower parts show the blocks. 



Example 2.4 Figure 2 shows a generalized hypertree decomposition of width 3 for the 
hypergraph from Figure 1. 

Definition 2.5 A hypergraph is acyclic if it has generalized hypertree width 1. In this 
case, the decomposition restricted to its blocks is called a join tree. 

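Let us fix some notation: For an edge set $\lambda \subseteq E$ we use the shorthand $\bigcup \lambda := \bigcup_{e \in \lambda} e$. 
For a decomposition $(\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ we write $\mathcal{T}_t$ for the subtree of $\mathcal{T}$ that has $t$ as 
its root. We also write $\chi(\mathcal{T}_t) := \bigcup_{t' \in V(\mathcal{T}_t)} \chi_{t'}$. 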

Definition 2.6 A generalized hypertree decomposition is called hingetree decomposition^1 
if it satisfies the following conditions: 

4. For each pair $t_1, t_2 \in T$ with $t_1 \neq t_2$ there are edges $e_1 \in \lambda_{t_1}$ and $e_2 \in \lambda_{t_2}$ such that 
$\chi_{t_1} \cap \chi_{t_2} \subseteq e_1 \cap e_2$. 

5. For each $t \in T$ we have $\bigcup \lambda_t = \chi_t$. 

6. For each $e \in E$ there is a $t \in T$ such that $e \in \lambda_t$. 

Hingetree width (also called degree of cyclicity) is defined analogously to generalized 
hypertree width. 

Example 2.7 The decomposition from Figure 2 is also a hingetree decomposition. 



^1 Note that this is not the original definition from [20] but an alternative, equivalent definition from [8]. 

Definition 2.8 The primal graph of a hypergraph $\mathcal{H} = (V, E)$ is the graph $\mathcal{H}_P = (V, E_P)$ 
with $E_P := \{uv \in \binom{V}{2} \mid \exists e \in E : u, v \in e\}$. 
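
A direct transcription of this definition into Python might look as follows; the name `primal_graph` is hypothetical.

```python
from itertools import combinations

def primal_graph(vertices, edges):
    """Connect two vertices whenever they occur together in some hyperedge."""
    primal_edges = {frozenset(pair)
                    for e in edges
                    for pair in combinations(sorted(e), 2)}
    return set(vertices), primal_edges

# Hypothetical hypergraph with one ternary and one binary edge
Vp, Ep = primal_graph({"a", "b", "c", "d"}, [{"a", "b", "c"}, {"c", "d"}])
```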

Definition 2.9 A tree decomposition of a hypergraph $\mathcal{H}$ is a generalized hypertree 
decomposition of its primal graph $\mathcal{H}_P$. The width of a tree decomposition is the size 
of its biggest bag minus 1. The treewidth of $\mathcal{H}$ is the minimum width over all tree 
decompositions of $\mathcal{H}$. 

For all decompositions defined above we define the width of a CQ-instance to be the 
width of the associated hypergraph. 

We now recall some known results on the various decomposition methods. 

Lemma 2.10 a) (see e.g. [18]) For all of the width measures defined above Boolean 
CQ-instances $\Phi$ of width $k$ can be solved in time $\|\Phi\|^{p(k)}$ for a polynomial $p$. 

b) ([20]) There is an algorithm that given a hypergraph $\mathcal{H} = (V, E)$ computes a 
minimum width hingetree decomposition in time $|V|^{O(1)}$. 

c) (lEI) Computing minimum width tree decompositions is fixed parameter tractable 
parameterized by the treewidth. 

d) (m \14V There is an algorithm that given a hypergraph $\mathcal{H} = (V, E)$ of generalized 
hypertree width $k$ constructs a generalized hypertree decomposition of width $O(k)$ 
of $\mathcal{H}$ in time 

Definition 2.11 Let $\mathcal{H} = (V, E)$ be a hypergraph and $V' \subseteq V$. The induced subhypergraph 
$\mathcal{H}[V']$ of $\mathcal{H}$ is the hypergraph $\mathcal{H}[V'] = (V', \{e \cap V' \mid e \in E, e \cap V' \neq \emptyset\})$. 

Let $x, y \in V$; a path between $x$ and $y$ is a sequence of vertices $x = v_1, \ldots, v_k = y$ such 
that for each $i \in [k - 1]$ there is an edge $e_i \in E$ with $v_i, v_{i+1} \in e_i$. 

A (connected) component of $\mathcal{H}$ is the induced subhypergraph $\mathcal{H}[V']$ for a maximal vertex 
set $V'$ such that for each pair $x, y \in V'$ there is a path between $x$ and $y$ in $\mathcal{H}$. 
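
The following Python sketch illustrates induced subhypergraphs and connected components on a toy example. The helper names `induced` and `components` are assumptions; `components` returns only the vertex sets and expects every edge to be a subset of the given vertex set.

```python
def induced(vertices, edges, keep):
    """Induced subhypergraph H[V']: restrict every edge to V', drop empty ones."""
    keep = set(keep) & set(vertices)
    return keep, {frozenset(set(e) & keep) for e in edges if set(e) & keep}

def components(vertices, edges):
    """Vertex sets of the connected components (edges must lie inside vertices)."""
    remaining, comps = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for e in edges:
                if x in e:
                    for y in e:
                        if y not in comp:
                            comp.add(y)
                            stack.append(y)
        comps.append(comp)
        remaining -= comp
    return comps

# Hypothetical hypergraph; removing b disconnects a from c
V = {"a", "b", "c", "d", "e"}
E = [{"a", "b"}, {"b", "c"}, {"d", "e"}]
sub_v, sub_e = induced(V, E, {"a", "c", "d", "e"})
comps = components(sub_v, sub_e)
```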

Observation 2.12 Let $\beta$ be any decomposition technique defined in this section. Let $\mathcal{H} = 
(V, E)$ be a hypergraph of $\beta$-width $k$. Then for every $V' \subseteq V$ the induced subhypergraph 
$\mathcal{H}[V']$ has $\beta$-width at most $k$. 

Proof. Let $(\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ be a $\beta$-decomposition of $\mathcal{H}$ of width $k$. For each guarded 
block $(\lambda_t, \chi_t)$ compute a guarded block $(\lambda'_t, \chi'_t)$ with $\chi'_t := \chi_t \cap V'$ and $\lambda'_t := \{e \cap V' \mid e \in \lambda_t\}$. 
It is easy to check that $(\mathcal{T}, (\lambda'_t)_{t \in T}, (\chi'_t)_{t \in T})$ is a $\beta$-decomposition of width at most $k$. 

3 Quantified-star size 

In this section we generalize quantified star size which was introduced in [10] for acyclic 
conjunctive queries to general conjunctive queries. 
Definition 3.1 Let $\mathcal{H} = (V, E)$ be a hypergraph and $S \subseteq V$. Let $C$ be the vertex set of 
a connected component of $\mathcal{H}[V - S]$. Let $E_C$ be the set of hyperedges $\{e \in E \mid e \cap C \neq \emptyset\}$ 
and $V' := \bigcup_{e \in E_C} e$. Then $\mathcal{H}[V']$ is called an $S$-component of $\mathcal{H}$. 

Definition 3.2 Let $\mathcal{H} = (V, E)$ be a hypergraph. For a set $S \subseteq V$ the $S$-star size of 
$\mathcal{H}$ is the maximum size of an independent set consisting only of vertices in $S$ in an 
$S$-component of $\mathcal{H}$. We say that this independent set forms the $S$-star. 
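
To make Definitions 3.1 and 3.2 concrete, here is a brute-force Python sketch. The names `s_components` and `s_star_size` are hypothetical, and the search over all candidate subsets is exponential; it only illustrates the definitions and is not one of the discovery algorithms discussed in this paper.

```python
from itertools import combinations

def _components(vertices, edges):
    remaining, comps = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for e in edges:
                if x in e:
                    for y in e:
                        if y not in comp:
                            comp.add(y)
                            stack.append(y)
        comps.append(comp)
        remaining -= comp
    return comps

def s_components(vertices, edges, s):
    """S-components H[V'] following Definition 3.1."""
    rest = set(vertices) - set(s)
    rest_edges = [set(e) & rest for e in edges if set(e) & rest]
    result = []
    for comp in _components(rest, rest_edges):
        v_prime = set()
        for e in edges:
            if set(e) & comp:          # e belongs to E_C
                v_prime |= set(e)
        sub_edges = {frozenset(set(e) & v_prime) for e in edges if set(e) & v_prime}
        result.append((v_prime, sub_edges))
    return result

def _independent(vertex_set, edges):
    return not any(u in e and v in e
                   for u, v in combinations(vertex_set, 2) for e in edges)

def s_star_size(vertices, edges, s):
    """Largest independent set of S-vertices inside a single S-component."""
    best = 0
    for v_prime, sub_edges in s_components(vertices, edges, s):
        cands = sorted(set(s) & v_prime)
        for size in range(len(cands), best, -1):
            if any(_independent(c, sub_edges) for c in combinations(cands, size)):
                best = size
                break
    return best

# Star with quantified center z and free leaves
star = [{"z", "y1"}, {"z", "y2"}, {"z", "y3"}]
star_size = s_star_size({"z", "y1", "y2", "y3"}, star, {"y1", "y2", "y3"})
# Path y1 - z1 - y2 - z2 - y3 with free y's
path = [{"y1", "z1"}, {"z1", "y2"}, {"y2", "z2"}, {"z2", "y3"}]
path_size = s_star_size({"y1", "y2", "y3", "z1", "z2"}, path, {"y1", "y2", "y3"})
# Quantifier free case: S = V, so there are no S-components
qf_size = s_star_size({"y1", "z1"}, [{"y1", "z1"}], {"y1", "z1"})
```

On the path, each quantified variable sees only its two free neighbours, so the star size is 2 although three free variables are pairwise independent in the whole hypergraph.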

Example 3.3 Take $S = \{v_1, \ldots, v_9\}$ in the hypergraph of Figure 1. It has three 
$S$-components with respective edge lists: 

2. {u8,U9,n8}, {vs} 

3. {^6,^9,7x7}, {fe} 

The $S$-star size, i.e. the size of a maximum independent set of $S$-vertices in an $S$-component, 
is 4. The set {vi^V2,v^,V'j} forms an S-star (there are several other possibilities). 

It is easy to see that for acyclic hypergraphs this definition of $S$-star size coincides 
with the definition in [10], which was given only for acyclic hypergraphs. 

Definition 3.4 An $S$-hypergraph is a pair $(\mathcal{H}, S)$ where $\mathcal{H} = (V, E)$ is a hypergraph 
and $S \subseteq V$. To each formula $\phi$ we associate an $S$-hypergraph $(\mathcal{H}, S)$ where $\mathcal{H}$ is the 
hypergraph associated to $\phi$ and $S := \mathrm{free}(\phi)$. The quantified star size of a CQ instance 
$\Phi = (\mathcal{A}, \phi)$ is the $S$-star size of $(\mathcal{H}, S)$. 

Let $\mathcal{G}_{\mathrm{star}}$ be the class of $S$-hypergraphs $(\mathcal{H}_n, S_n)$, $n \in \mathbb{N}$, where $\mathcal{H}_n$ is a star graph and 
$S_n$ consists of its leaves. More precisely, $\mathcal{H}_n = (V_n, E_n)$, $S_n$ are defined as 

• $V_n = \{z, y_1, \ldots, y_n\}$, 

• $E_n = \{\{z, y_i\} \mid i = 1, \ldots, n\}$, 

• $S_n = \{y_1, \ldots, y_n\}$. 

We will use the following lemma from [10], for which we give an alternative, simpler 
proof below. 

Lemma 3.5 ([10]) #CQ is #W[1]-hard restricted to instances that have $S$-hypergraphs 
in $\mathcal{G}_{\mathrm{star}}$, parameterized by the size of the stars. 

Proof. We show the hardness by a parameterized T-reduction from p-#Clique. The 
basic idea is that instead of counting $k$-cliques in a graph, we can also count the $k$-tuples 
of vertices that are not a clique. 

So let $G = (V, E)$ be a simple, undirected graph and $k \in \mathbb{N}$. A tuple $(v_1, \ldots, v_k) \in V^k$ 
is not a clique if and only if there are $i, j \in [k]$, $i \neq j$ such that $v_i v_j$ is not an edge. 
Observe that because $G$ is loopless this is necessarily true if $(v_1, \ldots, v_k)$ contains a double 
vertex. We will show how to check if a tuple $(v_1, \ldots, v_k)$ is a clique with a CQ-instance of 
the prescribed form. 

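We construct a #CQ-instance $\Phi = (\mathcal{A}, \phi)$ with $\phi := \exists z \bigwedge_{i \in [k]} P_i(z, v_i)$. Clearly the 
formula is of the right form. The domain of $\mathcal{A}$ is $D := V \cup (V \times V \times [k] \times [k])$. For each 
$i \in [k]$ the structure $\mathcal{A}$ has the relation 

$P_i^{\mathcal{A}} := \{((v, w, i, j), v), ((w, v, j, i), v) \mid v, w \in V, v \neq w, vw \notin E, j \in [k], j \neq i\}$ 
$\cup\ (V \times V \times ([k] \setminus \{i\}) \times ([k] \setminus \{i\})) \times V$. 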

This completes the construction of <I>. 

First, observe that $\Phi$ can be constructed in time polynomial in $|G|$ and $k$, so if we can 
compute the number of $k$-cliques of $G$ from $|\phi(\mathcal{A})|$ sufficiently quickly, the construction 
is indeed a parameterized T-reduction. 

Furthermore, observe that for each satisfying assignment the variables $v_1, \ldots, v_k$ take 
only values in $V$. We claim that an assignment $a : \{v_1, \ldots, v_k\} \to D$ satisfies $\phi$ if and 
only if $a(v_1), \ldots, a(v_k)$ is not a clique of size $k$ in $G$. Essentially, the quantified variable 
$z$ here guesses the edge that is missing between $v_i$ and $v_j$. 

Indeed, if $a(v_1), \ldots, a(v_k)$ is a tuple of vertices such that two vertices in it are not 
adjacent, say $a(v_i)a(v_j) \notin E$, then assigning $(a(v_i), a(v_j), i, j)$ to $z$ satisfies 
all atoms. 

Let on the other hand $a(v_1), \ldots, a(v_k)$ be a clique of size $k$ in $G$. We claim that there 
is no assignment to $z$ that satisfies all atoms. Clearly in a satisfying assignment $z$ can 
take no value in $V$. So $z$ must take a value in $V \times V \times [k] \times [k]$, say $(v, w, i, j)$. But then 
in particular $P_i(z, v_i)$ and $P_j(z, v_j)$ are satisfied. It follows that $a(v_i) = v$, $a(v_j) = w$, 
$vw \notin E$, which is a contradiction. So indeed, $a(v_1), \ldots, a(v_k)$ is a clique of size $k$ in $G$ 
if and only if $a$ is not a satisfying assignment. 

It follows that the number of cliques in $G$ is $\frac{1}{k!}(|V|^k - |\phi(\mathcal{A})|)$. But $|V|^k$ and $k!$ can be 
easily computed in time $(k|V|)^{O(1)}$ and thus one can compute the number of $k$-cliques of 
$G$ from $|\phi(\mathcal{A})|$, $G$ and $k$ in time $(k|V|)^{O(1)}$, which completes the reduction. □ 
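
The counting identity at the end of the proof can be checked numerically on small graphs. The Python sketch below does not build the CQ-instance of the reduction; it counts the non-clique $k$-tuples directly by brute force and then applies the identity. The function names are hypothetical.

```python
from itertools import combinations, product
from math import factorial

def count_non_clique_tuples(vertices, edges, k):
    """Number of k-tuples of vertices containing a non-adjacent (or equal) pair."""
    adjacent = {frozenset(e) for e in edges}
    return sum(
        1
        for t in product(vertices, repeat=k)
        if any(frozenset((t[i], t[j])) not in adjacent
               for i, j in combinations(range(k), 2))
    )

def count_cliques(vertices, edges, k):
    """Recover the number of k-cliques as (|V|^k - #non-clique tuples) / k!."""
    bad = count_non_clique_tuples(vertices, edges, k)
    return (len(vertices) ** k - bad) // factorial(k)

# Hypothetical graph: K4 with the edge {3, 4} removed
V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]
```

A tuple with a repeated vertex is counted as a non-clique because the graph has no loops, which matches the observation made at the start of the proof.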

4 The complexity of counting 

In this section we show that the decomposition techniques introduced in Section 2 lead 
to efficient counting when combined with bounded quantified star size. Furthermore, we 
show that bounded quantified star size is necessary for efficient counting under standard 
assumptions. 

Theorem 4.1 There is an algorithm that given a #CQ-instance $\Phi = (\mathcal{A}, \phi)$ of quantified 
star size $\ell$ and a generalized hypertree decomposition $\Xi = (\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ of $\Phi$ of 
width $k$ counts the solutions of $\phi$ in time $\|\Phi\|^{p(k, \ell)}$ for a fixed polynomial $p$. 

In the proof we will use the following lemma from [10]. 



Lemma 4.2 For acyclic hypergraphs the size of a maximum independent set and a 
minimum edge cover coincide. Moreover, there is a polynomial time algorithm that given 
an acyclic hypergraph $\mathcal{H}$ computes a maximum independent set $I$ and a minimum edge 
cover $E^*$ of $\mathcal{H}$. 



Proof (of Theorem 4.1). Given $\Phi = (\mathcal{A}, \phi)$, we construct in two steps a solution equivalent instance 
which is of generalized hypertree width $k$, too, and has a quantifier free 
formula. 

Let $\mathcal{H} = (V, E)$ be the hypergraph of $\phi$. Let $V_1, \ldots, V_m$ be the vertex sets of the 
components of $\mathcal{H}[V - S]$ and let $V'_1, \ldots, V'_m$ be the vertex sets of the $S$-components of $\mathcal{H}$. 
Clearly, $V_i \subseteq V'_i$ and $V'_i - V_i = V'_i \cap S =: S_i$. Let $\Phi_i$ be the #CQ-instance whose formula 
$\phi_i$ is obtained by restricting all atoms of $\phi$ to the variables in $V'_i$ and whose structure $\mathcal{A}_i$ 
is obtained by projecting all relations of $\mathcal{A}$ accordingly. The associated hypergraph of 
$\phi_i$ is $\mathcal{H}[V'_i]$ and $\mathcal{H}[V'_i]$ has a generalized hypertree decomposition $\Xi_i$ of width at most $k$ 
with a tree $\mathcal{T}_i$ that is a subtree of $\mathcal{T}$ (see Observation 2.12). 

For each $\Phi_i$ we construct a new #CQ-instance $\Phi'_i = (\mathcal{A}'_i, \phi'_i)$ as follows. For each 
guarded block $b = (\lambda, \chi) \in \Xi_i$ we construct a new atomic formula $\phi_b$ in the variables $\chi$. 
The associated relation is given by $\pi_\chi\left(\bowtie_{\psi \in \mathrm{atom}(\phi_i):\ \mathrm{var}(\psi) \subseteq \bigcup \lambda} \psi(\mathcal{A}_i)\right)$, i.e. by taking the natural 
join of all relations whose variables are guarded in the guarded block and projecting 
on $\chi$. The formula $\phi'_i$ for $\Phi'_i$ is obtained as the conjunction of all $\phi_b$. The decomposition 
$\Xi_i$ has width at most $k$ so this can be done in time $\|\Phi\|^{p(k)}$. Obviously, $\Phi_i$ and $\Phi'_i$ 
are solution equivalent. Furthermore $\phi'_i$ is acyclic, because it has a decomposition with 
tree $\mathcal{T}_i$, the same blocks as $\Xi_i$ and width 1. Let $\mathcal{H}_i$ be the associated hypergraph of $\phi'_i$, 
then $\mathcal{H}_i$ has only one single $S_i$-component, because all the vertices in $V_i$ are connected 
in $\mathcal{H}$ and thus also in $\mathcal{H}_i$. Also the $S_i$-star size of $\mathcal{H}_i$ is at most $\ell$. To see this consider 
two independent vertices $u, v$ in $\mathcal{H}_i$. The edges of $\mathcal{H}_i$ are equal to the blocks of $\Xi_i$, so 
$u$ and $v$ do not appear in a common block in $\Xi_i$. But then $u$ and $v$ cannot appear in 
one common block in $\Xi$, because of $\mathcal{T}$ being a tree and the connectedness condition. 
So $u$ and $v$ are independent in $\mathcal{H}$, too, and thus every independent set in $\mathcal{H}_i$ is also 
independent in $\mathcal{H}$. So $\mathcal{H}_i$ indeed has $S_i$-star size at most $\ell$. Thus the vertices in $S_i$ can 
be covered by at most $\ell$ edges $e_1, \ldots, e_\ell$ in $\mathcal{H}_i$ which we can compute in polynomial 
time by Lemma 4.2. Let $a_1, \ldots, a_\ell$ be the corresponding atoms. We again construct a 
new atomic formula $\phi''_i$ in the variables $S_i$ only and an associated relation $\mathcal{A}''_i$ as follows: 
For each combination $t_1, \ldots, t_\ell$ of tuples in $a_1(\mathcal{A}'_i), \ldots, a_\ell(\mathcal{A}'_i)$ fix the free variables in 
$\phi'_i$ to the constants prescribed by the tuples $t_1, \ldots, t_\ell$ if these do not contradict. If the 
resulting CQ instance has a solution, add the projection of $t_1 \bowtie \ldots \bowtie t_\ell$ on $S_i$ to the 
relation $\mathcal{A}''_i$ of $\phi''_i$. By construction $\Phi'_i$ and $(\mathcal{A}''_i, \phi''_i)$ are solution equivalent. Observe that 
the instances to be solved in this construction are tractable [5^, so all of this can be 
done in time $\|\Phi\|^{p'(k, \ell)}$ for a polynomial $p'$. 

We now eliminate all quantified variables in the original formula $\phi$. To do so we add 
the atom $\phi''_i$ for $i \in [m]$ and delete all atoms that contain any quantified variable, i.e. 
we delete each $\phi'_i$. Add the $\mathcal{A}''_i$ to the structure $\mathcal{A}$ and call the resulting #CQ instance 
$\Phi'' = (\mathcal{A}'', \phi'')$. Because $(\mathcal{A}''_i, \phi''_i)$ is solution equivalent to $\Phi_i$ we have that $\Phi$ and $\Phi''$ 
are solution equivalent, too. We construct a guarded decomposition of $\phi''$ by doing the 
following: For each guarded block $(\lambda, \chi)$ of $\Xi$ with $\chi \cap V_i \neq \emptyset$ we construct a guarded 
block $(\lambda', \chi')$ by deleting all edges $e$ with $e \cap V_i \neq \emptyset$ from $\lambda$ and adding the edge $S_i$ for 
$\phi''_i$. Furthermore we set $\chi' = (\chi - V_i) \cup S_i$. It is easy to see that the result is indeed a 
generalized hypertree decomposition of $\Phi''$ of width at most $k$. 

With standard techniques (see e.g. [8]) we construct in polynomial time a quantifier 
free acyclic #CQ-instance that is solution equivalent to $\Phi''$. Its solutions and thus those 
of $\Phi$ can then be counted with the algorithm in [21] or [10]. □ 

We now show that bounded quantified star size is necessary for efficient counting no 
matter which other structural restrictions we put on $S$-hypergraphs. 

Lemma 4.3 Let $\mathcal{G}$ be a recursively enumerable class of $S$-hypergraphs such that #CQ 
for all instances whose $S$-hypergraph is in $\mathcal{G}$ is fixed parameter tractable parameterized 
by the size of the formulas. Then $\mathcal{G}$ has bounded $S$-star size or #W[1] = FPT. 

Proof (sketch). The proof is a generalization of the respective proof in [10]: We show
that if the $S$-star size of $\mathcal{G}$ is not bounded, then there is an FPT algorithm for #CQ on
$\mathcal{G}_{star}$, the class of stars with a single quantified variable in the center. As this problem is
#W[1]-hard by Lemma 3.5, it follows that #W[1] = FPT.

So assume that #CQ is tractable on $\mathcal{G}$ and that $\mathcal{G}$ has unbounded $S$-star size. We will
construct a fixed parameter algorithm for #CQ on $\mathcal{G}_{star}$. So let $\Phi = (\mathcal{A}, \varphi)$ be an instance
of this latter problem, i.e. $\Phi$ has the formula $\varphi := \exists z \bigwedge_{i=1}^{k} E_i(y_i, z)$. Let the domain of
$\mathcal{A}$ be $D$. Because $\mathcal{G}$ is recursively enumerable and of unbounded $S$-star size, there is a
computable function $g : \mathbb{N} \to \mathbb{N}$ such that for $k \in \mathbb{N}$ one can compute $(\mathcal{H}, S) \in \mathcal{G}$ with
$S$-star size at least $k$ in time $g(k)$. We will embed $\Phi$ into $\mathcal{H}$ to construct a #CQ-instance
$\Psi = (\mathcal{A}', \psi)$ of size $g(k)\,n^{O(1)}$ where $n$ is the size of $\Phi$. Furthermore, $\psi$ will have the
$S$-hypergraph $\mathcal{H}$ and $\mathcal{A}'$ the same domain $D$ as $\mathcal{A}$. For convenience, $\psi$ will be built on a
language containing one distinct relation symbol for each hyperedge in $\mathcal{H}$.

Let $\mathcal{H}'$ be the $S$-component of $\mathcal{H}$ that contains $k$ independent vertices in the respective
primal component. Call these vertices $s_1, \ldots, s_k$. We may assume that the $s_i$ are also
computed in time $g(k)$ during the construction of $\mathcal{H}$. Observe that there must be a vertex
$v$ that is connected to each of the $s_i$ by a path $P_i$ such that the only vertex in $P_i$ that
is in $S$ is $s_i$, because all the $s_i$ lie in the same $S$-component. We now construct a #CQ
instance that has the associated $S$-hypergraph $\mathcal{H}$.

All vertices that do not lie on any $P_i$ are forced to a dummy value $d$ in a straightforward
way by all their constraints. All vertices on the $P_i$ that are not among the $s_j$ may take arbitrary
but equal values in $D$. This is possible, because they are all connected to the common
vertex $v$ by paths. Let $v_i$ be the predecessor of $s_i$ on $P_i$. For all constraints that contain
$v_i$ and $s_i$ we allow exactly the combinations allowed by the relation of $E_i$.
Observe that there is no edge that contains more than one of the $s_j$ by definition, so each
constraint has at most $|D|^2$ tuples.

Clearly, $\Phi$ and $\Psi$ have the same number of solutions. Furthermore, we have $|\psi| \le g(k)$
and $\Psi$ can be constructed in time at most $g(k)\|\Phi\|^2$, because $\mathcal{H}$ has size at most $g(k)$
and the size of the relations for the constraints is bounded by $|D|^2$. But by assumption
the solutions of $\Psi$ can be counted in time $h(|\psi|)\,\|\Psi\|^{c}$ for a constant $c$ and a computable
function $h$. Thus the solutions of $\Phi$ can be counted in time bounded by a computable function of $k$
times $\|\Phi\|^{c'}$ for a constant $c'$, which completes the proof. □



With Theorem 4.1 and Lemma 4.3 we have a solid understanding of the complexity
of #CQ for structural classes that can be characterized by restrictions of generalized
hypertree width. For each decomposition method with what Cohen et al. call the
"tractable construction" property, i.e. there must be a way to construct a decomposition
efficiently, quantified star size is essentially the only parameterization that makes counting
tractable. For the definitions of decomposition techniques not defined in this paper see [13].

Corollary 4.4 Let $\beta$ be one of the following decomposition techniques: biconnected
component, cycle-cutset, cycle-hypercutset, hinge-tree, hypertree, or generalized hypertree
decomposition. Let furthermore $\mathcal{G}$ be a recursively enumerable class of $S$-hypergraphs
of bounded $\beta$-width. Then counting solutions to all #CQ-instances whose associated
hypergraph is in $\mathcal{G}$ is tractable if and only if $\mathcal{G}$ is of bounded $S$-star size (assuming
FPT $\neq$ #W[1]).

5 An optimal result for bounded arity 

In this section we show that for bounded arity #CQ we can exactly characterize which 
classes of instances allow polynomial time counting. This result is derived by combining 
the results of the preceding sections and the following theorem from [19] that we rephrase 
in our slightly different wording.

Theorem 5.1 ([19]) Let $\mathcal{G}$ be a recursively enumerable class of hypergraphs of bounded
arity. Assume FPT $\neq$ W[1]. Then the following three statements are equivalent:

1. Boolean CQ for all instances with hypergraphs in $\mathcal{G}$ can be decided in polynomial
time.

2. Boolean CQ for all instances with hypergraphs in $\mathcal{G}$ is fixed parameter tractable
parameterized by the size of the formulas.

3. The hypergraphs in $\mathcal{G}$ are of bounded treewidth.



Theorem 5.1 is originally stated to be true even for every fixed vocabulary. It has
been generalized to any recursively enumerable class of conjunctive formulas [11]. In
this context, a characterization of tractability for counting solutions of quantifier-free
conjunctive queries is given in [9] in almost the same terms as Theorem 5.1 but with the
weaker assumption that FPT $\neq$ #W[1]. We show here a complete characterization of
tractability for counting for general conjunctive queries. Not too surprisingly, tractability
depends on both treewidth and star size of the underlying hypergraph.

Theorem 5.2 Let $\mathcal{G}$ be a recursively enumerable class of $S$-hypergraphs of bounded arity.
Assume that W[1] $\neq$ FPT. Then the following statements are equivalent:

1. #CQ for all instances whose $S$-hypergraph is in $\mathcal{G}$ is solvable in polynomial time.

2. #CQ for all instances whose $S$-hypergraph is in $\mathcal{G}$ is fixed parameter tractable
parameterized by the size of the formulas.

3. There is a constant $c$ such that for each $S$-hypergraph $(\mathcal{H}, S)$ in $\mathcal{G}$ the treewidth of
$\mathcal{H}$ and the $S$-star size are at most $c$.

Proof. The direction 1 $\to$ 2 is trivial. Furthermore, 3 $\to$ 1 follows directly from Theorem
4.1. So it remains only to show 2 $\to$ 3.



So assume that there is a recursively enumerable class $\mathcal{G}$ of $S$-hypergraphs such that
counting solutions to #CQ-instances whose $S$-hypergraphs are in $\mathcal{G}$ is fixed parameter
tractable but statement 3 is not satisfied by $\mathcal{G}$. From Lemma 4.3 we know that the $S$-star size of $\mathcal{G}$
must be bounded, so it follows that the treewidth of $\mathcal{G}$ must be unbounded.

We construct a class $\mathcal{G}'$ of hypergraphs by doing the following: For each $S$-hypergraph
$(\mathcal{H}, S)$ in $\mathcal{G}$ we add $\mathcal{H}$ to $\mathcal{G}'$. Clearly $\mathcal{G}'$ is recursively enumerable and of unbounded
treewidth. We will show that Boolean CQ for all instances whose hypergraphs are in $\mathcal{G}'$
is fixed parameter tractable parameterized by the size of the formula. This leads to a
contradiction with Theorem 5.1.

Because $\mathcal{G}$ is recursively enumerable, there is an algorithm that for each $\mathcal{H}$ in $\mathcal{G}'$
constructs an $S$-hypergraph $(\mathcal{H}, S)$ in $\mathcal{G}$ that has led to the addition of $\mathcal{H}$ to $\mathcal{G}'$. For
example one can simply enumerate all $S$-hypergraphs in $\mathcal{G}$ until finding such an $(\mathcal{H}, S)$.
Let $f(\mathcal{H})$ be the number of steps the algorithm needs on input $\mathcal{H}$. The function $f(\mathcal{H})$ is
well defined and computable. We then define $g : \mathbb{N} \to \mathbb{N}$ by setting $g(k) := \max_{\mathcal{H}}(f(\mathcal{H}))$,
where the maximum is over all hypergraphs $\mathcal{H}$ of size $k$ in $\mathcal{G}'$. Clearly, $g$ is again well
defined and computable. Thus for each $\mathcal{H}$ in $\mathcal{G}'$ we can compute in time $g(|\mathcal{H}|)$ an
$S$-hypergraph $(\mathcal{H}, S)$ in $\mathcal{G}$.

Now let $\Phi = (\mathcal{A}, \varphi)$ be a CQ-instance with hypergraph $\mathcal{H}$ in $\mathcal{G}'$. To solve it we first
compute $(\mathcal{H}, S)$ as above and construct a #CQ-instance $\Psi = (\mathcal{A}, \psi)$ with $(\mathcal{H}, S)$ as
associated $S$-hypergraph for $\psi$ by adding existential quantifiers for all variables not in $S$.
Obviously $\Phi$ has solutions if and only if $\Psi$ has any. But by assumption the solutions of
$\Psi$ can be counted in time $h(|\psi|)\,\|\Psi\|^{O(1)}$ for some computable function $h$, so $\Phi$ can be
decided in time $(g(|\varphi|) + h(|\varphi|))\,\|\Phi\|^{O(1)}$ and thus is fixed parameter tractable. This is
the desired contradiction to Theorem 5.1. □



Remark 5.3 Note that our characterization relies on the underlying hypergraph structures
of the query. In [11] the corresponding characterizations are stronger in the
sense that they are true for any recursively enumerable class of conjunctive formulas.
Also these results and the one from [19] can be proved for every fixed vocabulary, while
our proofs of Lemmas 3.5 and 4.3 and thus also Theorem 5.2 rely on the fact that we
can choose our vocabulary in the construction. It remains an open question whether our
result can be improved similarly to the others.

Also, the result in [9] (for quantifier free #CQ) is proved under the weaker assumption
#W[1] $\neq$ FPT. Showing the same equivalent result for general #CQ seems to be hard
since our case also contains decision problems (e.g. #CQ with no free variables).

6 Computing star size

In this section we consider the problem of computing the quantified star size of bounded
width instances. Observe that the computation of quantified star size is not strictly
necessary. The algorithm of Theorem 4.1 does not need to find $S$-stars for graphs of
width $k$ but only for acyclic hypergraphs, which is easy by Lemma 4.2. Still it is of course
desirable to know the quantified star size of an instance before applying the counting
algorithm, because quantified star size has an exponential influence on the runtime. We
show that for all decomposition techniques considered in this paper the quantified star
size can be computed rather efficiently, roughly in time $n^{O(k)}$ where $k$ is the width of the
input. For small values of $k$, this bound is reasonable. We then proceed by showing that,
on the one hand, for some decomposition measures such as treewidth or hingetree width, the
computation of quantified star size is even fixed parameter tractable parameterized by
the width. On the other hand, we show that for decomposition measures above hypertree
width it is unlikely that fixed parameter tractability can be obtained (under standard
assumptions).

Instead of tackling quantified star size directly, we consider the combinatorially less 
complicated notion of independent sets, which is justified by the following observation: 

Observation 6.1 Let $\beta$ be any decomposition technique considered in this paper. Then
for every $k \in \mathbb{N}$ computing the $S$-star size of $S$-hypergraphs of $\beta$-width at most $k$ polynomial
time Turing-reduces to computing the size of a maximum independent set for hypergraphs
of $\beta$-width at most $k$. Furthermore, there is a polynomial time many-one reduction from
computing the size of a maximum independent set in hypergraphs of $\beta$-width at most $k$ to
computing the $S$-star size of hypergraphs of $\beta$-width at most $k + 1$.

Proof. By definition computing $S$-star size reduces to the computation of independent
sets of $S$-components. $S$-components are induced subhypergraphs, so we get the first
direction from Observation 2.12.

For the other direction let $\mathcal{H} = (V, E)$ be a hypergraph for which we want to compute
the size of a maximum independent set. Let $x \notin V$. We construct the hypergraph $\mathcal{H}'$
with vertex set $V' = V \cup \{x\}$ and edge set $E' = \{e \cup \{x\} \mid e \in E\}$ and set $S := V$. The
hypergraph is one single $S$-component, because $x$ is in every edge. Furthermore, the
$S$-star size of $\mathcal{H}'$ is obviously the size of a maximum independent set in $\mathcal{H}$. It is easy to
see that the construction increases the treewidth of the hypergraph by at most 1 and
does not increase the $\beta$-width for all other decompositions considered here at all. □
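
The construction in the second half of this proof is small enough to sketch in code. The following Python sketch is only an illustration under stated assumptions, not part of the original argument: the function names and the toy hypergraph are hypothetical, every vertex is assumed to lie in at least one edge, and the search is brute force (exponential). It adds a fresh vertex to every edge and checks that the largest independent subset of $S = V$ is unchanged.

```python
from itertools import combinations


def is_independent(vertices, edges):
    """Return True if no edge contains two of the given vertices."""
    chosen = set(vertices)
    return all(len(chosen & edge) <= 1 for edge in edges)


def max_independent_size(candidates, edges):
    """Brute-force size of a largest independent subset of `candidates`."""
    ordered = sorted(candidates)
    for size in range(len(ordered), 0, -1):
        if any(is_independent(c, edges) for c in combinations(ordered, size)):
            return size
    return 0


def add_apex(vertices, edges, apex="x"):
    """Build H' = (V + {x}, {e + {x} : e in E}) together with S := V."""
    if apex in vertices:
        raise ValueError("the new vertex must not already occur in V")
    new_vertices = set(vertices) | {apex}
    new_edges = [frozenset(edge) | {apex} for edge in edges]
    return new_vertices, new_edges, set(vertices)


# Hypothetical toy hypergraph: a path on four vertices.
V = {"a", "b", "c", "d"}
E = [frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"c", "d"})]
V_new, E_new, S = add_apex(V, E)
```

Since two vertices of $V$ share an edge of the new hypergraph exactly when they share an edge of the old one, `max_independent_size(S, E_new)` agrees with `max_independent_size(V, E)` on such inputs.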



Because of Observation 6.1 we will not talk about $S$-star size in this section anymore
but instead formulate everything with independent sets.

6.1 Exact computation

Proposition 6.2 There is an algorithm that, given a hypergraph $\mathcal{H} = (V, E)$ and a
generalized hypertree decomposition $\Xi = (\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ of $\mathcal{H}$ of width $k$, computes
a maximum independent set of $\mathcal{H}$ in time $k|V|^{O(k)}$.

Proof. We apply dynamic programming along the decomposition. Let $b = (\lambda, \chi)$ be a
guarded block of $\mathcal{T}$. Let $\mathcal{T}_b$ be the subtree of $\mathcal{T}$ with $b$ as its root. We set $V_b := \chi(\mathcal{T}_b)$.
Observe that $I \subseteq V_b$ is independent in $\mathcal{H}$ if and only if it is independent in $\mathcal{H}[V_b]$, so
we do not differentiate between the two notions. For each independent set $\sigma \subseteq \chi$ we
will compute an independent set $I_{b,\sigma} \subseteq V_b$ that is maximum under the independent sets
containing exactly the vertices $\sigma$ from $\chi$. Observe that because $\lambda$ contains at most $k$
edges that cover $\chi$ we have to compute at most $k n^{k}$ independent sets $I_{b,\sigma}$ for each $b$.

If $b$ is a leaf of $\mathcal{T}$, the construction of the $I_{b,\sigma}$ is straightforward and can certainly be
done within the claimed time bound.

Let now $b = (\lambda, \chi)$ be an inner vertex of $\mathcal{T}$ with children $b_1, \ldots, b_r$. For each independent
set $\sigma \subseteq \chi$ we do the following: Let $b_i = (\lambda_i, \chi_i)$, then let $\sigma_i$ be an independent set of
$\chi_i$ such that $\sigma \cap \chi_i = \sigma_i \cap \chi$ and $|I_{b_i,\sigma_i}|$ is maximal. We claim that we can set

$$I_{b,\sigma} := \sigma \cup I_{b_1,\sigma_1} \cup \ldots \cup I_{b_r,\sigma_r}.$$

We first show that $I_{b,\sigma}$ defined this way is independent. Assume this is not true, then
$I_{b,\sigma}$ contains $x, y$ that are in one common edge $e$ in $\mathcal{H}[V_b]$. But then $x, y$ do not both lie
in $\chi$, because $I_{b,\sigma} \cap \chi = \sigma$ and $\sigma$ is independent. By induction $x, y$ do not lie in one $V_{b_i}$
either. Assume that $x \in \chi$ and $y \in V_{b_i}$ for some $i$. Then certainly $x \notin V_{b_i}$ and $y \notin \chi$.
But the edge $e$ must lie in one guard $\lambda'$ such that the corresponding block $\chi'$ contains
$e$. Because of the connectivity condition for $y$ the guarded block $(\lambda', \chi')$ must lie in the
subtree with root $b_i$, which contradicts $x \in e$. Finally, assume that $x \in V_{b_i}$ and $y \in V_{b_j}$
for $i \neq j$ and $x, y \notin \chi$. Then $x$ and $y$ cannot be adjacent because of the connectivity
condition. This shows that $I_{b,\sigma}$ is indeed independent.

Now assume that $I_{b,\sigma}$ is not of maximum size and let $J \subseteq V_b$ be an independent set
with $|J| > |I_{b,\sigma}|$ and $J \cap \chi = \sigma$. Because $J$ and $I_{b,\sigma}$ are fixed to $\sigma$ on $\chi$ there must be
a $b_i$ such that $|J \cap V_{b_i}| > |I_{b_i,\sigma_i}|$. This contradicts the choice of $\sigma_i$. So $I_{b,\sigma}$ is indeed of
maximum size.

Because each block has at most $k|V|^{k}$ independent sets, all computations can be done
in time $k|V|^{O(k)}$. □

6.2 Parameterized complexity 

While the algorithm in the last section is nice in that it is a polynomial time algorithm 
for fixed k, it is somewhat unsatisfying for some decomposition techniques: If we can 
compute the decomposition quickly, we would ideally want to be able to compute the star size
efficiently, too. Naturally we cannot expect a polynomial time algorithm independent of 
k, because independent set is NP-complete, but we can hope for at least fixed parameter 
tractability with respect to k. We will show that this is indeed possible for some width 
measures, in particular tree decompositions and hingetree decompositions. On the other 
hand we show that this can likely not be extended to more general decomposition 
techniques, because independent set parameterized by hypertree width is W[l]-hard. 

Proposition 6.3 Given a hypergraph $\mathcal{H}$, computing a maximum independent set in $\mathcal{H}$ is
fixed parameter tractable parameterized by the treewidth of $\mathcal{H}$.

This can be seen either by applying Courcelle's Theorem or by straightforward dynamic
programming. Interestingly, one can show the same result also for bounded hingetree
width, which is a decomposition technique in which the bags are of unbounded size. This
unbounded size makes the dynamic programming in the proof far more involved than for
treewidth.

Proposition 6.4 Given a hypergraph $\mathcal{H}$, computing a maximum independent set in $\mathcal{H}$ is
fixed parameter tractable parameterized by the hingetree width of $\mathcal{H}$.

Proof. First observe that minimum width hingetree decompositions can be computed in
polynomial time [20], so we simply assume that a decomposition is given in the rest of
the proof.



The proof has some similarity with that of Proposition 6.2, so we use some notation
from there. For each guarded block $(\lambda, \chi)$ we will again compute maximum independent
sets containing prescribed vertices. The difference is that we can take these prescribed
sets to be of size 1: because of the hingetree condition, only one vertex of a block may
be reused in any independent set in the parent. The second idea is that we can use
equivalence classes of vertices in the computations of independent sets in the considered
guarded blocks, which limits the number of independent sets we have to consider. We
now describe the computation in detail.

Let $\Xi = (\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ be a hingetree decomposition of $\mathcal{H}$ of width $k$. Let
$b = (\lambda, \chi)$ be a guarded block of $\Xi$ and let $b' = (\lambda', \chi')$ be its parent. As before, let $\mathcal{T}_b$ be
the subtree of $\mathcal{T}$ with $b$ as its root and $V_b := \chi(\mathcal{T}_b)$. Set $\mathcal{H}_b := (V_b, E_b)$ with $E_b := \bigcup \lambda^*$,
the union being over all guarded blocks $(\lambda^*, \chi^*)$ in $\mathcal{T}_b$. The main idea is to iteratively
compute, for all vertices $v \in \chi' \cap \chi$, a maximum independent set $J_{v,b}$ in $\mathcal{H}_b = (V_b, E_b)$
containing $v$. Furthermore, we also compute an independent set $J_{\emptyset,b}$ that contains no
vertices of $\chi' \cap \chi$. Note that, since $\chi = \bigcup_{e \in \lambda} e$, there are no isolated vertices in $\chi$ and
the size of a maximum independent set is bounded by $k$ in each block.

For a node $b = (\lambda, \chi)$ we organize the vertices in $\chi$ into at most $2^{k}$ equivalence classes
by defining $v$ and $u$ to be equivalent if they lie in the same subset of edges of $\lambda$. The
equivalence class of $v$ is denoted by $c(v)$. For each class, a representant is fixed. We
denote by $\hat{v}$ the representant of the equivalence class of $v$ and by $\hat{\chi}$ the restriction
of $\chi$ to these at most $2^{k}$ representants.
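
The grouping of block vertices into classes can be sketched as follows. This Python fragment is an illustrative assumption, not the paper's procedure: the function name, the choice of the smallest vertex as representant, and the toy guarded block are hypothetical. Two vertices fall into the same class exactly when they lie in the same subset of guard edges, so a guard with $k$ edges yields at most $2^k$ classes.

```python
from collections import defaultdict


def equivalence_classes(block, guard):
    """Group the vertices of `block` by the set of guard edges containing them.

    Returns a dict mapping a chosen representant to the members of its class.
    """
    by_signature = defaultdict(list)
    for vertex in sorted(block):
        signature = frozenset(i for i, edge in enumerate(guard) if vertex in edge)
        by_signature[signature].append(vertex)
    # The smallest vertex of each class serves as its representant.
    return {members[0]: members for members in by_signature.values()}


# Hypothetical guarded block with k = 2 guard edges.
guard = [{1, 2, 3}, {3, 4, 5}]
block = {1, 2, 3, 4, 5}
classes = equivalence_classes(block, guard)
```

In this toy block, vertices 1 and 2 are interchangeable with respect to independence inside the block, as are 4 and 5, which is why computations can be restricted to representants.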

Let first 6 be a leaf. We first compute independent sets on x- Observe that the 
independent sets are invariant under the choice of representants. For each equivalence 
class c{v), we compute J^^b C x as a maximum independent set containing v. Computing 
the classes and a choice of maximum independent sets containing each v can be done in 
time ^2^^ because independent sets cannot be bigger than k. Clearly, J^^b-, a maximum 
independent set containing v, can be easily computed from the set Jy^. Thus, one can 
compute all the Jy^b in time k2^ n. The computation of J^^b can be done on representants, 
too, by simply excluding the vertices from x' H x- 

Let b now be an inner vertex and bi,b2, .■.,bm be its children with bi = (Aj, Xi)' ^ ^ 
We again consider equivalence classes on x- Fix f G X and compute the list Ly^b of all 
independent sets o" C x containing v. Fix now a G Ly^b- We first compute a set J^j^
as a maximum independent set of Tib containing v and whose vertices in x have the
representants a. We will distinguish for a given vertex u G cr if it is the representant of a 
vertex belonging to the block of some (or several) children of b or if it represents vertices 
of x\(Ui!Li Xi) only. Therefore we partition a into a' , a" accordingly: 

• a := a'U a" 

• cr' := xn {u I u G \JT=iXi]- 

• cr" ■= X\{u I u G Uiti Xi] 

Set a' := {■ui, ...,'Ufe} with h < m. Let us examine the consequences of T being a 
hingetree decomposition. We have that, for all i G [m], there exists ej G A, such that 
X 1^ Xi ^ Cj. Thus, since a is an independent set in x ^ X) at most one vertex in a' is a 
representant of a vertex in Xj. Thus 

Vn 7^ G cj : Xi n c(m) = V Xi n c(u') = 0. (1) 

We denote by Si = {j \ c{ui) D Xj + 0} and by 5 = [m]\U5'i. By ([T]) the sets 
Si, Sh, S form a partition of [m]. To construct J^j,, we now determine for each i < h, 
which vertex u of c(ui) can contribute the most, by taking the union of all the maximum 
independent sets Ju,bj, j ^ Si, it induces at the level of the children of b. 

For each fixed u G c{ui), let 

Ii,u = {u}U [J Ju,bj, 

where we set Ju,bj ■= J%^bj if ^ ^ Xj- Let then li = li^u for some u G c{ui) for which the 
size of li^u is maximal. 

The set J^^ is now obtained as follows depending on whether u G ct" or -u G o"'. If 
V G a" , we claim that J^^ can be chosen as 

h 

:= {v} U {a"\{v}) U U /, U U J<l>,br 
li V G a' , say v = ui, we claim that J^^ can be chosen as 

h 

The set J^^b is taken as one of the sets J^^ of maximal size for a o" G L^ b- To compute 
J0 the arguments are similar. 

We first show that all Jy^b are indeed independent sets in lib- Clearly, it is enough to 
prove this for any J^^. There will be no reason to distinguish whether v G a" or v G a' , 
because our arguments will apply to all J^^^ independent of the choice of a distinguished 
element v. We will make extensive use of the two following facts. 






• Let j,/ G [m] and / ^ 14. C Vb ., independent sets of Tib, and Tib' respectively. 
By the connectivity condition for tree decomposition we have 

ini' Qxjn xf n x- 

This permits to investigate the intersection of two independent sets /, /' by looking 
at their restriction on x- 

• Let now / C VJ, be an independent set of T-Lb ■ Then, / remains an independent set in 
Tib- Indeed, suppose there is a e G Eb\Eb- containing two vertices yi,y2 £ I- Since 
all edges must belong to a guard, there exists a node h* = (A*, x*) such that e G A*. 
Then, since in a hingetree decomposition we have x* = U then {yi, 2/2} ^ e ^ X*- 
But then, by the connectivity condition it follows that {2/1,2/2} ^ X- Hence, by the 
intersection property of hingetree decomposition, there exists Cj G Xj such that 

{2/1,2/2} ^ xnxj nej 

which implies that 2/1 and 2/2 are adjacent in T-Lby Contradiction. 

We now start the proof that J^^ is independent incrementally. Let i G [/i], n G c(nj) 
and j G Si and consider the set / := Ju,bj ■ By induction, the set I is independent in 
T-ibj- By the hingetree condition, there exists Cj G Xj such that x^Xj ^ ^j- By the 
connectivity condition, this implies x H / C e^. Then, since / is an independent set, no 
two vertices of x can belong to / i.e. |x ^ /| < L The connectivity condition also implies 
that, for j' / j, Vby n / C x H Xj> hence \Vb., H I| < 1 and I is an independent set of Hb- 
Finally, the set Ii = Uje5 Ju,bj is also an independent set of lib-, since for any distinct 

Ju,b, n Ju,b^, Q Xj n Xj' n x ^ e^. 

Hence Jub Jub -, contains at most one vertex (which is in x and could then only be 
u). 

Let now G [m] be distinct. By the arguments above, /j (resp. Ij/) contains at most 
one element u (resp. u') such that u G c{ui) (resp. u' G c(uj')). By Equation [T| we have 
that the two classes are distinct and that Ui ^ Ui' . But Uj ,Uii G a and a is independent 
in X- Hence, Ui,Ui' cannot be adjacent in Tib- Consequently, 

h 

U'- 

1=1 

is an independent set in Tib- 

Let j G S. Jijf^bj is independent in Tib^ and Jijf^bj ^ ^bj\x- Hence, Jijf^bj is independent 
in Tib- This also implies that, given / G [m] distinct from j, Jgj bj H Vb_^, = 0. Thus, 

h 

i=i ies
is independent in Hh.

Finally, by construction, for all i £ [h], liDx = {u} with u = Ui £ a' . Also cr = o"' U g" 
is independent in x hence in ^5. No vertices y\ G Ij and £ cr" can be adjacent 
because, again, this would imply that {^1,^2} ^ X contradict the fact that 2/1,7/2 are 
independent in a. Thus J^^ is independent. 

We now prove that J^^h is of maximum size. Observe that it suffices to show this again 
for each J^^. Each maximum independent set J of Tib that contains v and whose vertices 
in X have exactly the representants a can be expressed as r U Ji U J2 U ... U J^. Here t x 
is an independent set of b containing v and whose representants are a. Furthermore, Jj is 
an independent set of Tib that contains only vertices of VJ,. . The set Jj may only contain 
one vertex Ui from x^^Xi- But then exchanging Jj for J^fii may only increase the size of 
the independent set, so we can assume that / has the form r U Jui,bi U Ju2,b2 U . . . U Jum,bm 
where Ui may also stand for 0. 

Assume now that J^^^ is not maximum, i.e. there is an independent set J containing v 
whose vertices in x have the representants a and J is bigger than J^^^. Then one of four 
following things must happen: 

• There is an i such that v £ Xi and J H VJ, . is bigger than J^ ^i ■ But this case cannot 
occur by induction. 

• V = ui and there is a j G Si such that v ^ Xj and | J n Vb^-| > | J0,bj |. By induction 
we know that Jti^bj is optimal under all independent sets of T-L^j not containing any 
vertex of Xj^X^ so there must be a vertex u £ J H x H Xj • Since J is independent, v 
and u share no edge in A and then v ^ u. Since j £ Si, it holds that c{v) H Xj / 
and by Equation [l] c{u) n Xj = 0- Contradiction. 

• There is an i S 5 such that J n Vj,. is bigger than J0 j,. . But from i £ S it follows 
by definition that x H Xi H J = 0, so this case can not occur by induction, either. 

• There is an i G [h] such that \Jr](\Jj^g. Vj)\ > \Ii\. We claim that (UjeSi XjO^xH J 
contains only one vertex. Assume there are two such vertices x and y. By definition, 
x,y £ f. Since J is independent, x and y are not adjacent in x and x ^ y. At least 
one of these, say y, must be in c{ui), because Uj G f by definition. Let x £ Vji with 
j' £ Si, then there is a vertex w £ c{ui) = c{y) in Xj' H x ^ Cj by definition of Si. 
But then x and y are adjacent in x which is a contradiction. 

So there is exactly one vertex u in {{Jj^g. Xj)'^X'^J- But then | Jn(|J^g^^ Vj)\ > li^u- 
Thus either there must be a j G Si with u £ Vj such that | J n > \Ju,bj\ or 
there must be a j G Si with u ^ Vj such that | J n Vj- 1 > | Jijf^bj I • The former 
clearly contradicts the optimality of J^.fe > while the latter leads to a contradiction 
completely analogously to the second item above. 

Because only k2^^Tn? sets have to be considered for each guarded block, this results in 
an algorithm with runtime k2^ \V\^^^\ 
Figure 3: We illustrate the construction for Lemma 6.5 by an example. A graph $G$ on
the left with the associated hypergraph $\mathcal{H}$ for $k = 4$ on the right. To keep the
illustration more transparent the edge sets $E_{i,j}$ are not shown except for $E_{1,2}$
and $E_{2,1}$.



Lemma 6.5 Computing maximum independent sets on hypergraphs is W[1]-hard
parameterized by generalized hypertree width.

Proof. We will show a reduction from p-IndependentSet which is the following problem:
Given a graph $G$ and an integer $k$ which is the parameter, decide if $G$ has an independent
set of size $k$. Because p-IndependentSet is well known to be W[1]-hard, this suffices to
establish W[1]-hardness of independent set parameterized by hypertree width.

So let $G = (V, E)$ be a graph and let $k$ be a positive integer. We construct a hypergraph
$\mathcal{H} = (V', E')$ in the following way: For each vertex $v \in V$ the hypergraph $\mathcal{H}$ has $k$ vertices
$v_1, \ldots, v_k$. For $i = 1, \ldots, k$ we have an edge $V_i := \{v_i \mid v \in V\}$ in $E'$. Furthermore,
for each $v \in V$ we add an edge $H_v := \{v_i \mid i \in [k]\}$. Finally we add the edge sets
$E_{ij} := \{v_i u_j \mid uv \in E\}$ for $i, j \in [k]$. $\mathcal{H}$ has no other vertices or edges. The construction
is illustrated in Figure 3.

We claim that $G$ has an independent set of size $k$ if and only if $\mathcal{H}$ has an independent
set of size $k$. Indeed, if $G$ has an independent set $v^1, \ldots, v^k$ then $v^1_1, \ldots, v^k_k$ is easily seen
to be an independent set of size $k$ in $\mathcal{H}$. Now assume that $\mathcal{H}$ has an independent set
$I$ of size $k$. Then for each $v \in I$ we can choose a vertex $\pi(v) \in V$ such that $v \in H_{\pi(v)}$.
Furthermore for distinct $v, u \in I$ the corresponding vertices $\pi(v), \pi(u)$ have to be distinct,
too, so $\pi(I) \subseteq V$ has size $k$. Finally, we claim that $\pi(I)$ is independent in $G$. Assume
this is not true, then there are vertices $\pi(v), \pi(u)$ such that $\pi(v)\pi(u) \in E$. But then
$vu \in E'$ by construction, which is a contradiction. So, indeed $G$ has an independent set
of size $k$ if and only if $\mathcal{H}$ has one.

We now show that $\mathcal{H}$ has generalized hypertree width at most $k$ by constructing a
generalized hypertree decomposition $(\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ of $\mathcal{H}$ of width $k$. The tree
$\mathcal{T}$ only consists of one single vertex $t$, the block of $t$ is $\chi_t := V'$, and the guard is
$\lambda_t := \{V_1, \ldots, V_k\}$. It is easily seen that this is indeed a hypertree decomposition of
width $k$.

Observing that the construction of $\mathcal{H}$ from $G$ can be done in time polynomial in $|V|$
and $k$ completes the proof. □
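
The reduction can be checked on small inputs. The sketch below is an illustration only: the function names and the toy graph are hypothetical, copies $v_i$ are encoded as pairs `(v, i)`, and the independent set search is brute force. It builds the hypergraph from a graph and a parameter $k$ exactly as described in the proof (edges $V_i$, $H_v$, and the sets $E_{ij}$).

```python
from itertools import combinations


def is_independent(vertices, edges):
    """Return True if no edge contains two of the given vertices."""
    chosen = set(vertices)
    return all(len(chosen & edge) <= 1 for edge in edges)


def has_independent_set(vertices, edges, size):
    """Brute-force test for an independent set of the given size."""
    return any(is_independent(c, edges)
               for c in combinations(sorted(vertices), size))


def reduction(graph_vertices, graph_edges, k):
    """Build the hypergraph of the proof; the copy v_i is the pair (v, i)."""
    levels = range(1, k + 1)
    vertices = {(v, i) for v in graph_vertices for i in levels}
    edges = [frozenset((v, i) for v in graph_vertices) for i in levels]   # V_i
    edges += [frozenset((v, i) for i in levels) for v in graph_vertices]  # H_v
    for u, v in graph_edges:                                              # E_ij
        edges += [frozenset({(u, i), (v, j)}) for i in levels for j in levels]
    return vertices, edges


# Hypothetical toy graph: the path a - b - c.
path_vertices = ["a", "b", "c"]
path_edges = [("a", "b"), ("b", "c")]
graph_edge_sets = [frozenset(e) for e in path_edges]
```

On this path, both the graph and the constructed hypergraph have an independent set of size 2 but none of size 3, in line with the claimed equivalence.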



6.3 Approximation 

We have seen that computing maximum independent sets of hypergraphs with decompositions
of width $k$ can be done in polynomial time for fixed width $k$ and that for
some decompositions it is even fixed parameter tractable with respect to $k$. Still, the
exponential influence of $k$ is troubling. In this section we will show that we can get rid of
it if we are willing to sacrifice the optimality of the solution. We give a $k$-approximation
algorithm for computing maximum independent sets of hypergraphs with generalized hypertree
width $k$ assuming that a decomposition is given. We start by formulating a lemma.

Lemma 6.6 Let $\mathcal{H}$ be a hypergraph with a generalized hypertree decomposition $\Xi =
(\mathcal{T}, (\lambda_t)_{t \in T}, (\chi_t)_{t \in T})$ of width $k$. Let $\mathcal{H}' = (V, E')$ where $E' := \{\chi_t \mid t \in T\}$. Let $\ell$ be the
size of a maximum independent set in $\mathcal{H}$ and let $\ell'$ be the size of a maximum independent
set in $\mathcal{H}'$. Then

$$\frac{\ell}{k} \le \ell' \le \ell.$$

Before we prove Lemma 6.6 we will show how to get the approximation algorithm from
it.



Observation 6.7 Every independent set of $\mathcal{H}'$ is also an independent set of $\mathcal{H}$.

Proof. Each pair of independent vertices $x, y$ in $\mathcal{H}'$ lies by definition only in different
blocks $\chi_t$ of the decomposition. For each edge $e \in E$ there must by definition of generalized hypertree
decompositions be a block $\chi$ such that $e \subseteq \chi$. Thus no edge $e \in E$ can contain both $x$
and $y$, so $x$ and $y$ are independent in $\mathcal{H}$, too. □
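
The observation can be verified exhaustively on a small instance. The following Python sketch is illustrative only: the hypergraph, its blocks, and the helper names are assumptions made for this example. It treats the blocks as the edges of the second hypergraph and checks that every set independent there is also independent in the original hypergraph.

```python
from itertools import chain, combinations


def is_independent(vertices, edges):
    """Return True if no edge contains two of the given vertices."""
    chosen = set(vertices)
    return all(len(chosen & edge) <= 1 for edge in edges)


def all_subsets(vertices):
    """Enumerate all subsets of a small vertex set as tuples."""
    ordered = sorted(vertices)
    return chain.from_iterable(
        combinations(ordered, r) for r in range(len(ordered) + 1))


# Hypothetical toy hypergraph H and the blocks of a decomposition of H.
V = {1, 2, 3, 4}
E = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
blocks = [frozenset({1, 2, 3}), frozenset({3, 4})]  # edge set E' of H'

# Every edge of H lies inside some block, as the definition requires.
covered = all(any(edge <= block for block in blocks) for edge in E)

independent_in_blocks = [s for s in all_subsets(V) if is_independent(s, blocks)]
transfer_holds = all(is_independent(s, E) for s in independent_in_blocks)
```

The converse fails in general: in this toy instance the set {1, 3} is independent in the original hypergraph but not with respect to the blocks, which is why the approximation can lose a factor.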



Corollary 6.8 There is a polynomial time algorithm that, given a hypergraph $\mathcal{H}$ and a
generalized hypertree decomposition of width $k$, computes an independent set $I$ of
$\mathcal{H}$ such that $|I| \ge \ell / k$, where $\ell$ is the size of a maximum independent set of $\mathcal{H}$.



Proof. Observe that $\mathcal{H}'$ is acyclic. With Lemma 4.2 we compute a maximum independent
set $I$ of $\mathcal{H}'$ whose size by Lemma 6.6 differs from $\ell$ by a factor of at most $k$. By Observation
6.7 we have that $I$ is also an independent set of $\mathcal{H}$. □

Proof of Lemma 6.6. The second inequality follows directly from Observation 6.7.

For the first inequality consider a maximum independent set $I$ of $\mathcal{H}$. Observe that a
set $I'$ is an independent set of $\mathcal{H}'$ if and only if it is an independent set of its primal
graph $\mathcal{H}'_P$, so it suffices to show the same result for $\mathcal{H}'_P$.

Claim 1 $\mathcal{H}'_P[I]$ has treewidth at most $k - 1$.

Proof. We construct a tree decomposition from $\Xi$. To do so consider $\Xi[I]$, which for
each guarded block $(\lambda, \chi)$ of $\Xi$ contains $(\lambda', \chi')$ where $\lambda' := \{e \cap I \mid e \in \lambda, e \cap I \neq \emptyset\}$ and
$\chi' := \chi \cap I$. The set $I$ is independent, so each guard of $\Xi[I]$ is a set of singletons and it
follows that $|\chi'| \le |\lambda'|$ for each guarded block $(\lambda', \chi')$.

Let $\mathcal{T}[I]$ be the tree of $\Xi[I]$ induced by $\mathcal{T}$ in the obvious way. Then the blocks
$\chi' = \chi \cap I$ still fulfill the connectedness condition. Furthermore, for each edge $uv$ in $\mathcal{H}'_P[I]$
there is a guarded block $(\lambda', \chi')$ such that $u, v \in \chi'$. Thus $\Xi[I]$ is a tree decomposition
of $\mathcal{H}'_P[I]$. But we have that $|\chi'| \le |\lambda'| \le |\lambda| \le k$ and thus the tree decomposition is of
width at most $k - 1$. □



Claim 2 $\mathcal{H}'_P[I]$ has an independent set $I'$ of size at least $\ell / k$.

Proof. From Claim 1 it follows that $\mathcal{H}'_P[I]$ and all of its subgraphs have a vertex of
degree at most $k - 1$ (see e.g. [11, p. 265]). We construct $I'$ iteratively by choosing a vertex
of minimum degree and deleting it and its neighbors from the graph. In each round we
delete at most $k$ vertices, so we can choose a vertex in at least $\ell / k$ rounds. Obviously the
chosen vertices are independent. □
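
The greedy procedure in this proof can be sketched in a few lines. The Python code below is a minimal illustration, not the paper's implementation: the function names and the two toy graphs are hypothetical, the graph is given as an adjacency dictionary with comparable vertex labels, and ties are broken by the smallest label. It repeatedly picks a vertex of minimum degree and deletes it together with its neighbors.

```python
def greedy_independent_set(adjacency):
    """Repeatedly pick a minimum-degree vertex and delete it with its neighbors."""
    remaining = {v: set(nbrs) for v, nbrs in adjacency.items()}
    chosen = []
    while remaining:
        # Minimum degree first; ties are broken by the smallest label.
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.append(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for nbrs in remaining.values():
            nbrs -= removed
    return chosen


def path(n):
    """Path on vertices 0..n-1 (treewidth 1 for n >= 2)."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def cycle(n):
    """Cycle on vertices 0..n-1 (treewidth 2 for n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


path_result = greedy_independent_set(path(5))
cycle_result = greedy_independent_set(cycle(6))
```

With treewidth at most $k - 1$, each round removes at most $k$ vertices, so a graph on $n$ vertices yields at least $n / k$ chosen vertices; for the path ($k = 2$) and the cycle ($k = 3$) above the bound is met with room to spare.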

Every independent set of $\mathcal{H}'_P[I]$ is also an independent set of $\mathcal{H}'_P$, which completes the
proof of Lemma 6.6. □



7 Fractional Hypertree width 

In this section we extend the main results of the paper to fractional hypertree width,
which is the most general notion known so far that leads to tractable Boolean CQ [18].
In particular it is strictly more general than generalized hypertree width.

Definition 7.1 Let $\mathcal{H} = (V, E)$ be a hypergraph. A fractional edge cover of a vertex set
$S \subseteq V$ is a mapping $\psi : E \to [0, 1]$ such that for every $v \in S$ we have $\sum_{e \in E : v \in e} \psi(e) \ge 1$.
The weight of $\psi$ is $\sum_{e \in E} \psi(e)$. The fractional edge cover number of $S$, denoted by $\rho^*_{\mathcal{H}}(S)$,
is the minimum weight taken over all fractional edge covers of $S$.

A fractional hypertree decomposition of $\mathcal{H}$ is a triple $(\mathcal{T}, (\chi_t)_{t \in T}, (\psi_t)_{t \in T})$ where
$\mathcal{T} = (T, F)$ is a tree, and $\chi_t \subseteq V$ and $\psi_t$ is a fractional edge cover of $\chi_t$ for every $t \in T$,
satisfying the following properties:

1. For every $v \in V$ the set $\{t \in T \mid v \in \chi_t\}$ induces a subtree of $\mathcal{T}$.

2. For every $e \in E$ there is a $t \in T$ such that $e \subseteq \chi_t$.

The width of a fractional hypertree decomposition $(\mathcal{T}, (\chi_t)_{t \in T}, (\psi_t)_{t \in T})$ is $\max_{t \in T}(\rho^*_{\mathcal{H}}(\chi_t))$.
The fractional hypertree width of $\mathcal{H}$ is the minimum width over all fractional hypertree
decompositions of $\mathcal{H}$.
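
To make Definition 7.1 concrete, the following sketch checks the fractional edge cover condition on a small example. The helper names `is_fractional_edge_cover` and `weight`, and the triangle hypergraph, are assumptions chosen for illustration; they are not taken from the paper.

```python
from fractions import Fraction


def is_fractional_edge_cover(edges, psi, S):
    """edges: name -> set of vertices; psi: name -> weight; S: vertices to cover."""
    # Every weight has to lie in the interval [0, 1].
    if any(not 0 <= psi[e] <= 1 for e in edges):
        return False
    # Every vertex of S must receive total weight at least 1 from its edges.
    return all(
        sum(psi[e] for e, vs in edges.items() if v in vs) >= 1
        for v in S
    )


def weight(psi):
    """Total weight of the mapping psi."""
    return sum(psi.values())


# Hypothetical example: the triangle with edges {a,b}, {b,c}, {a,c}.
triangle = {"ab": frozenset("ab"), "bc": frozenset("bc"), "ac": frozenset("ac")}
half = {e: Fraction(1, 2) for e in triangle}
one_edge = {"ab": 1, "bc": 0, "ac": 0}
```

Here `half` covers all three vertices with total weight 3/2, whereas any cover with weights in {0, 1} needs two edges; `one_edge` fails because vertex c receives weight 0.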

Together with the previous results of this paper, the two following ones will serve as 
key ingredients to prove the main results of this section. 

Theorem 7.2 ([18]) The solutions of a CQ instance $\Phi$ with hypergraph $\mathcal{H}$ can be
enumerated in time $\|\Phi\|^{\rho^*(\mathcal{H}) + O(1)}$.

Theorem 7.3 ([22]) Given a hypergraph $\mathcal{H}$ and a rational number $w \ge 1$, it is possible
in time $\|\mathcal{H}\|^{O(w^3)}$ to either

• compute a fractional hypertree decomposition of $\mathcal{H}$ with width at most $7w^3 + 31w + 7$,
or

• correctly conclude that $\mathrm{fhw}(\mathcal{H}) > w$.

7.1 Tractable counting

We start off with the quantifier-free case, which we will use as a building block for the
more general result later.

Lemma 7.4 The solutions of a quantifier free CQ instance $\Phi$ with hypergraph $\mathcal{H}$ can be
counted in time $\|\Phi\|^{\mathrm{fhw}(\mathcal{H})^{O(1)}}$.



Proof. With Theorem 7.3 we can compute a fractional hypertree decomposition
$(\mathcal{T}, (B_t)_{t \in T}, (\psi_t)_{t \in T})$ of width at most $k := O(\mathrm{fhw}(\mathcal{H})^3)$. For each bag $B_t$ we can with
Theorem 7.2 in time $\|\Phi\|^{O(k)}$ compute all solutions to the CQ $\Phi[B_t]$ that is induced by the
variables in $B_t$. Let these solutions form a new relation $\mathcal{R}_t$ belonging to a new atom $\varphi_t$.
Then $\bigwedge_{t \in T} \varphi_t(B_t)$ gives a solution equivalent, acyclic, quantifier free #CQ instance of
size $\|\Phi\|^{O(k)}$. □
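
The proof ends with an acyclic, quantifier free instance whose atoms are arranged along the decomposition tree. One standard way to count the solutions of such an instance is a bottom-up dynamic program over the tree; the sketch below illustrates that idea and is not claimed to be the paper's own procedure. The function `count_solutions`, its argument layout and the two-atom example are assumptions for this sketch, and it assumes that the scopes satisfy the connectedness condition and that every variable occurs in some scope.

```python
def count_solutions(root, children, scope, rel):
    """Count satisfying assignments of an acyclic quantifier free instance.

    children: node -> list of child nodes of the rooted tree
    scope:    node -> tuple of variables of the atom at that node
    rel:      node -> set of tuples over scope[node]
    """
    def table(t):
        # counts[tau] = number of ways to extend tau to the subtree below t.
        counts = {tau: 1 for tau in rel[t]}
        for c in children.get(t, ()):
            child_counts = table(c)
            shared = [v for v in scope[c] if v in scope[t]]
            c_idx = [scope[c].index(v) for v in shared]
            t_idx = [scope[t].index(v) for v in shared]
            # Group the child's counts by projection onto the shared variables.
            grouped = {}
            for sigma, n in child_counts.items():
                key = tuple(sigma[i] for i in c_idx)
                grouped[key] = grouped.get(key, 0) + n
            # Multiply in the number of compatible extensions from this child.
            for tau in counts:
                counts[tau] *= grouped.get(tuple(tau[i] for i in t_idx), 0)
        return counts

    return sum(table(root).values())


# Hypothetical example: the instance r(x, y) AND s(y, z) over the domain {0, 1}.
scope = {"r": ("x", "y"), "s": ("y", "z")}
rel = {"r": {(0, 0), (0, 1), (1, 1)}, "s": {(0, 0), (1, 0), (1, 1)}}
children = {"r": ["s"]}
total = count_solutions("r", children, scope, rel)
```

Each node is processed once and each child table is scanned once per parent, so the running time is polynomial in the total size of the relations.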



We can now formulate a version of Theorem 4.1 for fractional hypertree width. 



Theorem 7.5 There is an algorithm that given a #CQ-instance $\Phi$ of quantified star size
$\ell$ and fractional hypertree width $k$ counts the solutions of $\Phi$ in time $\|\Phi\|^{p(k, \ell)}$ for a
polynomial $p$.



Proof. This is a minor modification of the proof of Theorem 4.1. Let $\mathcal{H} = (V, E)$ be
the hypergraph of $\Phi$. Because of Theorem 7.3 we may assume that we have a fractional
hypertree decomposition $\Xi := (\mathcal{T}, (\chi_t)_{t \in T}, (\psi_t)_{t \in T})$ of $\mathcal{H}$ of width $k' := k^{O(1)}$.
For each edge $e \in E$ we let $\varphi(e)$ be the atom of $\Phi$ that induces $e$.

Let $V_1, \ldots, V_m$ be the vertex sets of the components of $\mathcal{H} - S$ and let $V'_1, \ldots, V'_m$ be
the vertex sets of the $S$-components of $\mathcal{H}$. Clearly, $V_i \subseteq V'_i$ and $V'_i - V_i = V'_i \cap S =: S_i$.
Let $\Phi_i$ be the restriction of $\Phi$ to the variables in $V'_i$ and let $\Xi_i$ be the corresponding
fractional hypertree decomposition. Then $\Xi_i$ has a tree $\mathcal{T}_i$ that is a subtree of $\mathcal{T}$.

For each $\Phi_i$ we construct a new #CQ-instance $\Phi'_i$ by computing for each bag $B_t$ a
constraint $\varphi_t$ in the variables $B_t$ that contains the solutions of $\Phi_i[B_t]$. The decomposition
$\Xi$ has width at most $k'$, so this can be done in time $\|\Phi\|^{O(k')}$ by Theorem 7.2. Obviously
$\Phi_i$ and $\Phi'_i$ are solution equivalent and $\Phi'_i$ is acyclic. Furthermore, $\Phi'_i$ has only one single
$S_i$-component, because all the vertices in $V_i$ are connected in $\Phi$ and thus also in $\Phi'_i$. Let
$\mathcal{H}_i$ be the hypergraph of $\Phi'_i$; then $\mathcal{H}_i$ has $S_i$-star size at most $\ell$. Thus the vertices in
$S_i$ can be covered by at most $\ell$ edges in $\mathcal{H}_i$ by Lemma 4^. Pick $\ell$ such edges $e_1, \ldots, e_\ell$.
We construct a new atom $\varphi_i$ in the variables $S_i$ that is solution equivalent to $\Phi'_i$ by
doing the following: For each combination $t_1, \ldots, t_\ell$ of tuples in $\varphi(e_1), \ldots, \varphi(e_\ell)$ fix the
free variables in $\Phi'_i$ to the constants prescribed by the tuples $t_1, \ldots, t_\ell$ if these do not
contradict. If the resulting ACQ instance has a solution, add $t_1 \bowtie \ldots \bowtie t_\ell$ to the relation
of $\varphi_i$.

We now eliminate all quantified variables in $\Phi$. To do so we add the constraint $\varphi_i$ for
$i \in [m]$ and delete all constraints that contain any quantified variable, i.e. we delete each
$\Phi'_i$. Call the resulting #CQ instance $\Phi'$. Because $\varphi_i$ is solution equivalent to $\Phi'_i$ we have
that $\Phi$ and $\Phi'$ are solution equivalent, too.

We then construct a fractional hypertree decomposition of $\Phi'$ by doing the following:
we set $B'_t = (B_t \setminus \bigcup_{i \in I_t} V_i) \cup \bigcup_{i \in I_t} S_i$ for each bag $B_t$ where $I_t := \{i \mid B_t \cap V_i \neq \emptyset\}$. For
each bag $B_t$ we construct a fractional edge cover $\psi'_t$ of $B'_t$ by setting $\psi'_t(e) := \psi_t(e)$ for
all old edges and setting $\psi'_t(S_i) = 1$ for $i \in I_t$, where $S_i$ corresponds to the newly added
constraint $\varphi_i$ with $B_t \cap V_i \neq \emptyset$. The result is indeed a fractional edge cover of $B'_t$,
because each variable not in any $S_i$ is still covered as before and the variables in
$S_i$ are covered by definition of $\psi'_t$. Furthermore, we claim that the weight of the cover is
at most $k'$. Indeed, for each $i \in I_t$ we had for each $v \in B_t \cap V_i$ that $\sum_{e \in E : v \in e} \psi_t(e) \ge 1$. None of
these edges appears in the new decomposition anymore. Thus adding the edge $S_i$ with
weight 1 does not increase the total weight of the cover. It is now easy to see that doing
this construction for all $B_t$ leads to a fractional hypertree decomposition of $\Phi'$ of width
at most $k'$.



Applying Lemma 7.4 concludes the proof. □



7.2 Computing independent sets

$S$-star size, or equivalently maximum independent sets, of hypergraphs of bounded
fractional hypertree width can also be computed efficiently.

Lemma 7.6 The independent sets of a hypergraph $\mathcal{H} = (V, E)$ can be enumerated in
time $|\mathcal{H}|^{O(\rho^*(\mathcal{H}))}$.

Proof. Let $\mathcal{H} = (V, E)$. We construct a conjunctive query $\Phi$ with the hypergraph $\mathcal{H}$.
Let $V$ be the variables of $\Phi$, $\{0, 1\}$ the domain and add a relation $\mathcal{R}_e$ for each $e \in E$. The
relation $\mathcal{R}_e$ has all tuples that contain at most one 1 entry. Finally, $\Phi$ has the formula
$\bigwedge_{e \in E} \mathcal{R}_e(e)$.
Clearly, $\Phi$ has indeed the hypergraph $\mathcal{H}$. Furthermore the solutions of $\Phi$ are exactly the
characteristic vectors of independent sets of $\mathcal{H}$. Thus we can enumerate all independent
sets of $\mathcal{H}$ in time $|\mathcal{H}|^{O(\rho^*(\mathcal{H}))}$ with Theorem 7.2. □
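
The encoding in this proof can be checked on a small hypergraph. The sketch below builds the relations with at most one 1 entry, enumerates the solutions of the resulting query by brute force, and compares them with the sets that meet every edge in at most one vertex. The brute-force enumeration stands in for the algorithm of Theorem 7.2 and is exponential; all function names and the example hypergraph are assumptions for illustration.

```python
from itertools import combinations, product


def at_most_one_relation(arity):
    """All 0/1 tuples of the given arity with at most one entry equal to 1."""
    return {t for t in product((0, 1), repeat=arity) if sum(t) <= 1}


def cq_solutions(vertices, edges):
    """Solutions of the conjunction of R_e(e), returned as sets of vertices set to 1."""
    order = sorted(vertices)
    relations = {e: at_most_one_relation(len(e)) for e in edges}
    solutions = set()
    for values in product((0, 1), repeat=len(order)):
        assignment = dict(zip(order, values))
        if all(tuple(assignment[v] for v in e) in relations[e] for e in edges):
            solutions.add(frozenset(v for v in order if assignment[v] == 1))
    return solutions


def independent_sets(vertices, edges):
    """Vertex sets that contain at most one vertex of every edge."""
    order = sorted(vertices)
    result = set()
    for size in range(len(order) + 1):
        for subset in combinations(order, size):
            if all(len(set(subset) & set(e)) <= 1 for e in edges):
                result.add(frozenset(subset))
    return result


# Hypothetical example hypergraph with edges {a, b, c} and {c, d}.
vertices = {"a", "b", "c", "d"}
edges = [("a", "b", "c"), ("c", "d")]
```

For this example both functions return the same seven sets: the empty set, the four singletons, {a, d} and {b, d}.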



Lemma 7.7 There is an algorithm that given a hypergraph $\mathcal{H} = (V, E)$ of fractional
hypertree width at most $k$ computes a maximum independent set of $\mathcal{H}$ in time $|\mathcal{H}|^{k^{O(1)}}$.

Proof. Dynamic programming along a fractional hypertree decomposition. In a first
step we compute a fractional hypertree decomposition $(\mathcal{T}, (B_t)_{t \in T}, (\psi_t)_{t \in T})$ of width
$k' = k^{O(1)}$ of $\mathcal{H}$ with Theorem 7.3. For each bag $B_t$ we then compute all independent
sets of $\mathcal{H}[B_t]$ with Lemma 7.6, call this set $\mathcal{I}_t$.

By dynamic programming similar to the proof of Lemma 6.2 we then compute a
maximum independent set of $\mathcal{H}$. □



8 Conclusion 

The results of this paper give a clear picture of tractability for counting solutions of 
conjunctive queries for structural classes that are known to have tractable decision 
problems. Essentially counting is tractable if and only if these classes are combined 
with quantified star size. So to find more general structural classes that allow tractable 
counting, progress for the corresponding decision question appears to be necessary. 

Another way of generalizing the results of this paper would be extending the logic that 
the queries can be formulated in. Just recently Chen and Dalmau [7] have characterized
the tractable classes of bounded arity QCSP, which is essentially a version of CQ in
which universal quantifiers are also allowed. They do this by introducing a new width
measure for first order $\{\forall, \exists, \wedge\}$-formulas. We conjecture that their width measure also
characterizes the tractable cases for #QCSP, i.e. tractable decision and counting coincide
here. It would be interesting to see how far this can be pushed for the case of unbounded 
arity. 

Another extension appears in a recent paper by Chen [5] where he considers existential 
formulas that may use conjunction and disjunction. This is particularly interesting, 
because it corresponds to the classical select-project-join queries with union that play an 
important role in database theory. One may wonder if Chen's results may be extended 
to counting, too. 